Native American & Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments

Ethnicity is always a ticklish subject. On one hand we say to be leery of ethnicity estimates, but on the other hand, we all want to know who our ancestors were and where they came from. Many people hope to prove or disprove specific theories or stories about distant ancestors.

Reasons to be cautious about ethnicity estimates include:

  • Within continents, like Europe, it’s very difficult to discern ethnicity at the “country” level because of thousands of years of migration across regions where borders exist today. Ethnicity estimates within Europe can be significantly different from known and proven genealogy.
  • “Countries” in Europe are political constructs, and many are about the same size as states in the US, so differentiating between those populations accurately is almost impossible. Think of trying to tell the populations of Indiana and Illinois apart, for example. Yet we want to be able to tell the difference between ancestors who came from France and Germany.

Ethnicity states over Europe

  • Small amounts of ethnicity, even at the continental level, under 2-5%, can be noise and might be incorrect. That’s particularly true of trace amounts, 1% or less. However, that’s not always the case – which is why companies provide those small percentages. When hunting ancestors in the distant past, that small amount of ethnicity may be the only clue we have, because it may be all of their DNA that remains at detectable levels in our genome.

Noise in this case is defined as:

  • A statistical anomaly
  • A chance combination of your DNA from both parents that matches a reference population
  • Issues with the reference population itself, specifically admixture
  • Perhaps combinations of the above

You can read about the challenges with ethnicity here and here.

On the Other Hand

Having restated the appropriate caveats, on the other hand, we can utilize legitimate segments of our DNA to identify where our ancestors came from – at the continental level.

I’m specifically referring to Native American admixture, which is the example I’ll be using, but this process applies equally well to other minority or continental-level admixture. Minority, in this sense, means a minority ethnicity for you.

Native American ethnicity shows distinctly differently from African and European. Sometimes some segments of DNA that we inherit from Native American ancestors are reported as Asian, specifically Siberian, Northern or Eastern Asian.

Remember that the Native American people arrived as a small group via Beringia, a now flooded land bridge that once connected Siberia with Alaska.

beringia map

By Erika Tamm et al – Tamm E, Kivisild T, Reidla M, Metspalu M, Smith DG, et al. (2007) Beringian Standstill and Spread of Native American Founders. PLoS ONE 2(9): e829. doi:10.1371/journal.pone.0000829. Also available from PubMed Central., CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=16975303

After that time, the Native American/First Nations peoples were isolated from Asia, for the most part, and entirely from Europe until European exploration led to sustained European settlement and admixture in the Americas, beginning in the late 1400s and 1500s.

Family Inheritance

Testing multiple family members is extremely useful when working with your own personal minority heritage. This approach assumes that you’d like to identify your matches that share that genetic heritage because they share the same minority DNA that you do. Of course, that means you two share the same ancestor at some time in the past. Their genealogy, or your combined information, may hold the clue to identifying your ancestor.

In my family, my daughter has Native American segments that she inherited from me that I inherited from my mother.

Finding the same segment identified as Native American in several successive generations eliminates the possibility that the chance combination of DNA from your father and mother is “appearing” as Native, when it isn’t.
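The generational check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a vendor tool: the `Segment` tuple, the `common_region` function and every start and end position are hypothetical values invented for the example.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def common_region(segments: List[Segment]) -> Optional[Segment]:
    """Return the region shared by every segment, or None if there isn't one."""
    if not segments:
        return None
    if len({s.chromosome for s in segments}) != 1:
        return None  # segments on different chromosomes cannot overlap
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    if start >= end:
        return None  # the segments do not all overlap
    return Segment(segments[0].chromosome, start, end)


# Hypothetical Native-flagged segments on chromosome 1 in three generations.
my_mother = Segment("1", 160_000_000, 180_000_000)
me = Segment("1", 162_000_000, 178_000_000)
my_daughter = Segment("1", 165_000_000, 175_000_000)

shared = common_region([my_mother, me, my_daughter])
print(shared)
```

If the function returns a region, the same stretch is flagged in all three generations, which is the pattern that argues against a chance combination. If it returns `None`, the flagged segments don't line up and deserve more skepticism.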

We can use segment information to our benefit, especially if we don’t know exactly who contributed that DNA – meaning which ancestor.

We need to find a way to utilize those Native or other minority segments genealogically.

23andMe

Today, the only DNA testing vendor that provides consumers with a segment identification of our ethnicity predictions is 23andMe.

If you have tested at 23andMe, sign in and click on Ancestry on the top tab, then select Ancestry Composition.

Minority ethnicity ancestry composition.png

Scroll down until you see your painted chromosomes.

Minority ethnicity chromosome painting.png

By clicking on the region at left that you want to see, the rest of the regions are greyed out and only that region is displayed on your chromosomes, at right.

Minority ethnicity Native.png

According to 23andMe, I have two Native segments, one each on chromosomes 1 and 2. They show these segments on opposite chromosomes, meaning one (the top for example) would be maternal or paternal, and the bottom one would be the opposite. But 23andMe apparently could not tell for sure because neither my mother nor my father has tested there. This placement also turned out to be incorrect. The above image was my initial V3 test at 23andMe. My later V4 results were different.

Versions May Differ

Please note that your ethnicity predictions may differ based on which test you took, which is dictated by when you took the test. The image above is my V3 test, which was in use at 23andMe between 2010 and November 2013, and the image below is my V4 test, in use between November 2013 and August 2017.

23andMe apparently does not correct original errors involving what is known as “strand swap” where the maternal and paternal segments are inverted during analysis. My V4 test results are shown below, where the strands are correctly portrayed.

Minority ethnicity Native V4.png

Note that both Native segments are now on the lower chromosome “side” of the pair and the position on the chromosome 1 segment has shifted visually.

Minority ethnicity sides.png

I have not tested at 23andMe on the current V5 GSA chip, in use since August 9, 2017, but perhaps I should. The results might be different yet, with the concept being that each version offers an improvement over earlier versions as science advances.

If your parents have tested, 23andMe makes adjustments to your ethnicity estimates accordingly.

Although my mother can’t test at 23andMe, I happen to already know that these Native segments descend from my mother based on genealogical and genetic analysis, combined. I’m going to walk you through the process.

I can utilize my genealogy to confirm or refute information shown by 23andMe. For example, if one of those segments comes from known ancestors who were living in Germany, it’s clearly not Native, and it’s noise of some type.

We’re going to utilize DNAPainter to determine which ancestors contributed your minority segments, but first you’ll need to download your ethnicity segments from 23andMe.

Downloading Ethnicity Segment Data

Downloading your ethnicity segments is NOT THE SAME as downloading your raw DNA results to transfer to another vendor. Those are two entirely different files and different procedures.

To download the locations of your ethnicity segments at 23andMe, scroll down below your painted ethnicity segments in your Ancestry Composition section to “View Scientific Details.”

MInority ethnicity scientific details.png

Click on View Scientific Details and scroll down to near the bottom and then click on “Download Raw Data.” I leave mine at the 50% confidence level.

Minority ethnicity download raw data.png

Save this spreadsheet to your computer in a known location.

In the spreadsheet, you’ll see columns that provide the name of the segment, the chromosome copy number (1 or 2) and the chromosome number with start and end locations.

Minority ethnicity download.png

You really don’t care about this information directly, but DNAPainter does and you’ll care a lot about what DNAPainter does for you.
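If you are curious about what is inside the file before handing it to DNAPainter, a minimal Python sketch like the one below can list only the minority segments. The header names (`Ancestry`, `Copy`, `Chromosome`, `Start Point`, `End Point`) and the sample rows are assumptions made for illustration, so check them against your own download and adjust as needed.

```python
import csv
import io

# Hypothetical sample rows; the header names are assumptions, not a documented format.
SAMPLE = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1000000,150000000
Native American,2,chr1,165000000,175000000
East Asian & Native American,2,chr2,80000000,95000000
European,2,chr3,1000000,90000000
"""


def minority_segments(csv_text, label):
    """Return the rows whose ancestry name contains the given label."""
    found = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if label.lower() in row["Ancestry"].lower():
            found.append({
                "ancestry": row["Ancestry"],
                "copy": int(row["Copy"]),
                "chromosome": row["Chromosome"],
                "start": int(row["Start Point"]),
                "end": int(row["End Point"]),
            })
    return found


native = minority_segments(SAMPLE, "Native American")
for seg in native:
    print(seg["chromosome"], "copy", seg["copy"], seg["start"], seg["end"])
```

The match is a simple substring test, so combined labels that include the words "Native American" are picked up along with the plain one. To read a real file, replace `io.StringIO(csv_text)` with an opened file handle.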

DNAPainter

I wrote introductory articles about DNAPainter:

If you’re not familiar with DNAPainter, you might want to read these articles first and then come back to this point in this article.

Go ahead – I’ll wait!

Getting Started

If you don’t have a DNAPainter account, you’ll need to create one for free. Some features, such as having multiple profiles, are subscription based, but the functionality you’ll need for one profile is free.

I’ve named this example profile “Ethnicity Demo.” You’ll see your name where mine says “Ethnicity Demo.”

Minority ethnicity DNAPainter.png

Click on “Import 23andme ancestry composition.”

You will copy and paste all the spreadsheet rows in the entire downloaded 23andMe ethnicity spreadsheet into the DNAPainter text box and make your selection, below. The great news is that if you discover that your assumption about copy 1 being maternal or paternal is incorrect, it’s easy to delete the ethnicity segments entirely and simply repaint later. Ditto if 23andMe changes your estimate over time, like they have mine.

Minority ethnicity DNAPainter sides.png

I happen to know that “copy 2” is maternal, so I’ve made that selection.

You can then see your ethnicity chromosome segments painted, and you can expand each one to see the detail. Click on “Save Segments.”

MInority ethnicity DNAPainter Native painting

In this example, you can see my Native segments, called by various names at different confidence levels at 23andMe, on chromosome 1.

Depending on the confidence level, these segments are called some mixture of:

  • East Asian & Native American
  • North Asian & Native American
  • Native American
  • Broadly East Asian & Native American

It’s exactly the same segment, so you don’t really care what it’s called. DNAPainter paints all of the different descriptions provided by 23andMe, at all confidence levels as you can see above.

The DNAPainter colors are different from 23andMe colors and are system-selected. You can’t assign the colors for ethnicity segments.

Now, I’m moving to my own profile that I paint with my ancestral segments. To date, I have 78% of my segments painted by identifying cousins with known common ancestors.

On chromosomes 1 and 2, copy 2, which I’ve determined to be my mother’s “side,” these segments track back to specific ancestors.

Minority ethnicity maternal side

Chromosome 1 segments, above, track back to the Lore family, descended from Antoine (Anthony) Lore (Lord) who married Rachel Hill. Antoine Lore was Acadian.

Minority ethnicity chromosome 1.png

Clicking on the green segment bar shows me the ancestors I assigned when I painted the match with my Lore family member whose name is blurred, but whose birth surname was Lore.

The Chromosome 2 segment, below, tracks back to the same family through a match to Fred.

Minority ethnicity chromosome 2.png

My common ancestors with Fred are Honore Lore and Marie Lafaille who are the parents of Antoine Lore.

Minority ethnicity common ancestor.png

There are additional matches on both chromosomes who also match on portions of the Native segments.

Now that I have a pointer in the ancestral direction that these Native American segments arrived from, what can traditional genealogy and other DNA information tell me?

Traditional Genealogy Research

The Acadian people were a mixture of English, French and Native American. The Acadians settled in Nova Scotia in 1609 and lived there until being driven out by the English in 1755, roughly 6 or 7 generations later.

Minority ethnicity Acadian map.png

The Acadians intermarried with the Mi’kmaq people.

Two very qualified genealogists have reported that Philippe Mius, born in 1660, married two Native American women from the Mi’kmaq tribe, both given the name Marie.

The French were fond of giving the first name of Marie to Native women when they were baptized in the Catholic faith which was required before the French men were allowed to marry the Native women. There were many Native women named Marie who married European men.

Minority ethnicity Native mitochondrial tree

This Mius lineage is ancestral to Antoine Lore (Lord) as shown on my pedigree, above.

Mitochondrial DNA has revealed that descendants of one of Philippe Mius’s wives, Marie, carry haplogroup A2f1a.

However, mitochondrial tests show that other descendants of “Marie,” his first wife, carry haplogroup X2a2, which is also Native American.

Confusion has historically existed over which Marie is the mother of my ancestor, Francoise.

Karen Theriot Reader, another professional genealogist, shows Francoise Mius as the last child born to the first Native wife before her death sometime after 1684 and before about 1687 when Philippe remarried.

However, relative to the source of Native American segments, whether Francoise descends from the first or second wife doesn’t matter in this instance because both are Native and are proven so by their mitochondrial DNA haplogroups.

Additionally, on Antoine’s mother’s side, we find a Doucet male, although there are two genetic male Doucet lines, one of European origin, haplogroup R-L21, and one, surprisingly, of Native origin, haplogroup C-P39. Both are proven by their respective haplogroups but confusion exists genealogically over who descends from which lineage.

On Antoine’s mother’s side, there are several unidentified lineages, any one or multiples of which could also be Native. As you can see, there are large gaps in my tree.

We do know that these Native segments arrived through Antoine Lore and his parents, Honore Lore and Marie Lafaille. We don’t know exactly who upstream contributed these segments – at least not yet. Painting additional matches attributable to specific ancestral couples will eventually narrow the candidates and allow me to walk these segments back in time to their rightful contributor.

Segments, Traditional Research and DNAPainter

These three tools together, continent-level ethnicity segments, traditional research and painting the DNA segments of known cousins who match specific lineages, create a triangulated ethnicity segment.

When that segment happens to be genealogically important, this combination can point researchers in the right direction by showing which lines to search for that minority ancestor.

If your cousins who match you on this segment have also tested with 23andMe, they should also be identified as Native on this same segment. This process does not apply to intracontinental segments, meaning within Europe, because the admixture is too great and the ethnicity predictions are much less reliable.

When identifying minority admixture at the continental level, adding Y and mitochondrial DNA testing to positively identify each individual ancestor’s Y and mitochondrial DNA is very important, because it can eliminate or confirm candidates in ways that autosomal DNA and genealogy records alone can’t. The base haplogroup assigned at 23andMe is a good start, but it’s not enough alone. Plus, each of us carries only one line of mitochondrial DNA, and only males carry Y DNA, which represents only their direct paternal line.

We need Y and mitochondrial DNA matching at FamilyTreeDNA to verify the specific lineage. Additionally, we very well may need the Y and mitochondrial DNA information that we don’t directly carry – but other cousins do. You can read about Y and mitochondrial DNA testing, here.

I wrote about creating a personal DNA pedigree chart including your ancestors’ Y and mitochondrial DNA here. In order to find people descended from a specific ancestor who have DNA tested, I utilize:

  • WikiTree resources and trees
  • Geni trees
  • FamilySearch trees
  • FamilyTreeDNA autosomal matches with trees
  • AncestryDNA autosomal matches and their associated trees
  • Ancestry trees in general, meaning without knowing if they are related to a DNA match
  • MyHeritage autosomal matches and their trees
  • MyHeritage trees in general

At both MyHeritage and Ancestry, you can view the trees of your matches, but you can also search for ancestors in other people’s trees to see who might descend appropriately to provide a Y or mitochondrial DNA sample. You will probably need a subscription to maximize these efforts. MyHeritage offers a free trial subscription here.

If you find people appropriately descended through WikiTree, Geni or FamilySearch, you’ll need to discuss DNA testing with them. They may have already tested someplace.

If you find people who have DNA tested through your DNA matches with trees at Ancestry and MyHeritage, you’ll need to offer a Y or mitochondrial DNA test to them if they haven’t already tested at FamilyTreeDNA.

FamilyTreeDNA is the only vendor who provides the Y DNA and mitochondrial DNA tests at the higher resolution level, beyond base haplogroups, required for matching and for a complete haplogroup designation.

If the person has taken the Family Finder autosomal test at FamilyTreeDNA, they may have already tested their Y DNA and mtDNA, or you can offer to upgrade their test.

Projects

Checking projects at FamilyTreeDNA can be particularly useful when trying to discover if anyone from a specific lineage has already tested. There are many special interest projects, such as the Acadian AmerIndian Ancestry project, the American Indian project, haplogroup projects, surname projects and more.

You can view projects alphabetically here or you can click here to scroll down to enter the surname or topic you are seeking.

Minority ethnicity project search.png

If the topic isn’t listed, check the alphabetic index under Geographical Projects.

23andMe Maternal and Paternal Sides

If possible, you’ll want to determine which “side” of your family your minority segments come from, unless they come from both. That means determining whether chromosome copy 1 or 2 is maternal, because the other one will be paternal.

23andMe doesn’t offer tree functionality in the same way as other vendors, so you won’t be able to identify people there descended from your ancestors without contacting each person or doing other sleuthing.

Recently, 23andMe added a link to FamilySearch that creates a list of your ancestors from their mega-shared tree for 7 generations, but there is no tree matching or search functionality. You can read about the FamilySearch connection functionality here.

So, how do you figure out which “side” is which?

Minority ethnicity minority segment.png

The chart above represents the portion of your chromosomes that contains your minority ancestry. Initially, you don’t know if the minority segment is your mother’s pink chromosome or your father’s blue chromosome. You have one chromosome from each parent with the exact same addresses or locations, so it’s impossible to tell which side is which without additional information. Either the pink or the blue segment is minority, but how can you tell?

In my case, the family oral history regarding Native American ancestry was from my father’s line, but the actual Native segments wound up being from my mother, not my father. Had I made an assumption, it would have been incorrect.

Fortunately, in our example, you have both a maternal and paternal aunt who have tested at 23andMe. You match both aunts on that exact same segment location – one from your father’s side, blue, and one from your mother’s side, pink.

You compare your match with your maternal aunt and verify that indeed, you do match her on that segment.

You’ll want to determine if 23andMe has flagged that segment as Native American for your maternal aunt too.

You can view your aunt’s Ancestry Composition by selecting your aunt from the “Your Connections” dropdown list above your own ethnicity chromosome painting.

Minority ethnicity relative connections.png

You can see on your aunt’s chromosomes that indeed, those locations on her chromosomes are Native as well.

Minority ethnicity relative minority segments.png

Now you’ve identified your minority segment as originating on your maternal side.

Minority ethnicity Native side.png

Let’s say you have another match, Match 1, on that same segment. You can easily tell which “side” Match 1 is from. Since you know that you match your maternal aunt on that minority segment, if Match 1 matches both you and your maternal aunt, then you know that’s the side the match is from – AND that person also shares that minority segment.

You can also view that person’s Ancestry Composition, but shared matching is more reliable, especially when dealing with small amounts of minority admixture.

Another person, Match 2, matches you on that same segment, but this time, the person matches you and your paternal aunt, so they don’t share your minority segment.

Minority ethnicity match side.png

Even if your paternal aunt had not tested, because Match 2 does not match you AND your maternal aunt, you know Match 2 doesn’t share your minority segment which you can confirm by checking their Ancestry Composition.
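The Match 1 and Match 2 reasoning can be written down as a small Python sketch. It assumes the matches are genuine, that you have already decided the minority segment is maternal, and that you have recorded which known relatives each match also shares that segment with. The function name and "Match 3" (a match who shares the segment with no known relative) are hypothetical additions for the example.

```python
# Relatives known to be on the side that carries the minority segment.
MINORITY_SIDE_RELATIVES = {"maternal aunt"}

# For each match on the minority segment location, the known relatives
# that the match ALSO matches on that same segment.
matches_on_segment = {
    "Match 1": {"maternal aunt"},
    "Match 2": {"paternal aunt"},
    "Match 3": set(),  # hypothetical: shares the segment with no known relative
}


def shares_minority_segment(also_matches, minority_side_relatives):
    """True when the match also matches a relative on the minority side."""
    return bool(set(also_matches) & set(minority_side_relatives))


results = {
    name: shares_minority_segment(relatives, MINORITY_SIDE_RELATIVES)
    for name, relatives in matches_on_segment.items()
}
for name, shares in results.items():
    print(name, "shares the minority segment" if shares else "does not share it")
```

The sketch only restates the logic of shared matching. A match flagged as sharing the segment is still worth confirming by looking at that person's Ancestry Composition.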

Download All of Your Matches

Rather than go through your matches one by one, it’s easiest to download your entire match list so you can see which people match you on those chromosome locations.

Minority ethnicity download aggregate data.png

You can click on “Download Aggregate Data” at 23andMe, at the bottom of your DNA Relatives match list, to obtain all of your matches who are sharing with you. 23andMe limits your matches to 2000 or fewer, the actual number being your highest 2000 matches minus the people who aren’t sharing. I have 1465 matches showing, and that number decreases regularly because new testers at 23andMe are focused on health and not genealogy, meaning lower matches get pushed off the list of 2000 match candidates.

You can quickly sort the spreadsheet to see who matches you on specific segments. Then, you can check each match in the system to see if that person matches you and another known relative on the minority segments or you can check their Ancestry Composition, or both.
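Instead of sorting by hand, the same filtering can be sketched in Python. The column names (`Display Name`, `Chromosome Number`, `Chromosome Start Point`, `Chromosome End Point`), the match names and all positions below are assumptions for illustration, so compare them with the headers in your own download before relying on this.

```python
import csv
import io

# Hypothetical rows standing in for the downloaded match list.
SAMPLE_MATCHES = """Display Name,Chromosome Number,Chromosome Start Point,Chromosome End Point
Match A,2,82000000,93000000
Match B,2,10000000,20000000
Match C,1,166000000,172000000
Match D,1,176000000,190000000
"""

# Hypothetical minority segments as (chromosome, start, end).
MINORITY = [("1", 165_000_000, 175_000_000), ("2", 80_000_000, 95_000_000)]


def overlapping_matches(csv_text, minority_segments):
    """List matches whose shared segment overlaps any minority segment."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        chrom = row["Chromosome Number"]
        start = int(row["Chromosome Start Point"])
        end = int(row["Chromosome End Point"])
        for m_chrom, m_start, m_end in minority_segments:
            if chrom == m_chrom and start < m_end and end > m_start:
                # Keep only the overlapping portion of the match.
                hits.append((row["Display Name"], chrom,
                             max(start, m_start), min(end, m_end)))
    return sorted(hits, key=lambda hit: (hit[1], hit[2]))


hits = overlapping_matches(SAMPLE_MATCHES, MINORITY)
for name, chrom, start, end in hits:
    print(name, "chromosome", chrom, start, end)
```

An overlap only tells you that a match shares DNA at that location. It does not tell you which of your two chromosome copies the match is on, so the side still has to be checked through shared matching with a known relative.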

If they share your minority segment, then you can check their tree link if they have one, included in the download, their FamilySearch information if included on their account, or reach out to them to see if you might share a known ancestor.

The key to making your ethnicity segment work for you is to identify ancestors and paint known matches.

Paint Those Matches

When searching for matches whose DNA you can attribute to specific ancestors, be sure to check at all 4 places that provide segment information that you can paint:

  • 23andMe
  • FamilyTreeDNA
  • MyHeritage
  • GedMatch

At GedMatch, you’ll find some people who have tested at the other various vendors, including Ancestry, but unfortunately not everyone uploads. Ancestry doesn’t provide segment information, so you won’t be able to paint those matches directly from Ancestry.

If your Ancestry matches transfer to GedMatch, FamilyTreeDNA or MyHeritage you can view your match and paint your common segments. At GedMatch, Ancestry kit numbers begin with an A. I use my Ancestry kit matches at GedMatch to try to figure out who that match is at Ancestry and, from there, the common ancestor.

To Paint, You Must Test

Of course, in order to paint your matches that you find in various databases, you need to be in those databases, meaning you either need to test there or transfer your DNA file.

Transfers

If you’d like to test your DNA at one vendor and download the file to transfer to another vendor, or to GedMatch, that’s possible with FamilyTreeDNA and MyHeritage, both of which accept uploads.

You can transfer kits from Ancestry and 23andMe to both FamilyTreeDNA and MyHeritage for free, although the chromosome browsers, advanced tools and ethnicity require an unlock fee (or alternatively a subscription at MyHeritage). Still, the free transfer and unlock for $19 at FamilyTreeDNA or $29 at MyHeritage is less than the cost of testing.

Here’s a quick cheat sheet.

DNA vendor transfer cheat sheet 2019

From time to time, as vendor file formats change, the ability to transfer is temporarily interrupted, but it costs nothing to try a transfer to either MyHeritage or FamilyTreeDNA, or better yet, both.

In each of these articles, I wrote about how to download your data from a specific vendor and how to upload from other vendors if they accept uploads.

Summary Steps

In order to use your minority ethnicity segments in your genealogy, you need to:

  1. Test at 23andMe
  2. Identify which parental side your minority ethnicity segments are from, if possible
  3. Download your ethnicity segments
  4. Establish a DNAPainter account
  5. Upload your ethnicity segments to DNAPainter
  6. Paint matches of people with whom you share known common ancestors utilizing segment information from 23andMe, FamilyTreeDNA, MyHeritage and AncestryDNA matches who have uploaded to GedMatch
  7. If you have not tested at either MyHeritage or FamilyTreeDNA, upload your 23andMe file to either vendor for matching, along with GedMatch
  8. Focus on those minority segments to determine which ancestral line they descend through in order to identify the ancestor(s) who provided your minority admixture.

Have fun!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

First Steps When Your DNA Results are Ready – Sticking Your Toe in the Genealogy Water

First steps helix

Recently someone asked me what the first steps would be for a person who wasn’t terribly familiar with genealogy and had just received their DNA test results.

I wrote an article called DNA Results – First Glances at Ethnicity and Matching which was meant to show new folks what the various vendor interfaces look like. I was hoping this might whet their appetites for more, meaning that the tester might, just might, stick their toe into the genealogy waters😊

I’m hoping this article will help them get hooked! Maybe that’s you!

A Guide

This article can be read in one of two ways – as an overview, or, if you click the links, as a pretty thorough lesson. If you’re new, I strongly suggest reading it as an overview first, then a second time as a deeper dive. Use it as a guide to navigate your results as you get your feet wet.

I’ll be hotlinking to various articles I’ve written on lots of topics, so please take a look at details (eventually) by clicking on those links!

This article is meant as a guideline for what to do, and how to get started with your DNA matching results!

If you’re looking for ethnicity information, check out the First Glances article, plus here and here and here.

Concepts – Calculating Ethnicity Percentages provides you with guidelines for how to estimate your own ethnicity percentages based on your known genealogy and Ethnicity Testing – A Conundrum explains how ethnicity testing is done.

OK, let’s get started. Fun awaits!

The Goal

The goal for using DNA matching in genealogy depends on your interests.

  1. To discover cousins and family members that you don’t know. Some people are interested in finding and meeting relatives who might have known their grandparents or great-grandparents in the hope of discovering new family information or photos they didn’t know existed previously. I’ve been gifted with my great-grandparent’s pictures, so this strategy definitely works!
  2. To confirm ancestors. This approach presumes that you’ve done at least a little genealogy, enough to construct at least a rudimentary tree. Ancestors are “confirmed” when you DNA match multiple other people who descend from the same ancestor through multiple children. I wrote an article, Ancestors: What Constitutes Proof?, discussing how much evidence is enough to actually confirm an ancestor. Confirmation is based on a combination of both genealogical records and DNA matching and it varies depending on the circumstances.
  3. To identify unknown parents. Adoptees and people with unknown parents who are seeking the identities of those people aren’t initially looking at their own family tree – because they don’t have one yet. The genealogy of others can help them figure out the identity of those mystery people. I wrote about that technique in the article, Identifying Unknown Parents and Individuals Using DNA Matching.

DNAAdoption for Everyone

Educational resources for adoptees and non-adoptees alike can be found at www.dnaadoption.org. DNAAdoption is not just for adoptees and provides first rate education for everyone. They also provide trained and mentored search angels for adoptees who understand the search process along with the intricacies of navigating the emotional minefield of adoption and unknown parent searches.

“First Look” classes for each vendor are free for everyone at DNAAdoption and are self-paced, downloadable onto your computer as a PDF file. Intro to DNA, Applied Autosomal DNA and Y DNA Basics classes are nominally priced at between $29 and $49 and I strongly recommend these. DNAAdoption is entirely non-profit, so your class fee or contribution supports their work. Additional resources can be found here and their 12 adoptee search steps here.

OK, now let’s look at your results.

Matches are the Key

Regardless of your goal, your DNA matches are the key to finding answers, whether you want to make contact with close relatives, prove your more distant ancestors or you’re involved in an adoptee or unknown parent search.

Your DNA matches that of other people because each of you inherited a piece of DNA, called a segment, where many locations are identical. Those locations are called SNPs, or single nucleotide polymorphisms, and the length of that DNA segment is measured in centiMorgans. You can read about the definition of a centiMorgan and how they are used in the article Concepts – CentiMorgans, SNPs and Pickin’ Crab.

While the scientific details are great, they aren’t important initially. What is important is to understand that the more closely you match someone, the more closely you are related to them. You share more DNA with close relatives than more distant relatives.

For example, I share exactly half of my mother’s DNA, but only about 25% of each of my grandparents’ DNA. As the relationships move further back in time, I share less and less DNA with other people who descend from those same ancestors.
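The halving pattern is easy to see with a quick calculation. The short Python sketch below uses a hypothetical helper, `expected_share`, to print the average fraction of autosomal DNA inherited from a single ancestor at each generation. Beyond your parents these figures are averages only, because the amount actually inherited from each grandparent and more distant ancestor varies from person to person.

```python
def expected_share(generations_back):
    """Average fraction of autosomal DNA from one ancestor n generations back."""
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent"]
for generation, label in enumerate(labels, start=1):
    print(f"{label}: {expected_share(generation):.2%}")

# Seven generations back, the average contribution of one ancestor is under 1%.
print(f"7 generations back: {expected_share(7):.2%}")
```

This is why the contribution of a single distant ancestor can fall into the trace range, or disappear entirely, even when the genealogy is correct.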

Informational Tools

Every vendor’s match page looks different, as was illustrated in the First Glances article, but regardless, you are looking for four basic pieces of information:

  • Who you match
  • How much DNA you share with your match
  • Who else you and your match share that DNA with, which suggests that you all share a common ancestor
  • Family trees to reveal the common ancestor between people who match each other

Every vendor has different ways of displaying this information, and not all vendors provide everything. For example, 23andMe does not support trees, although they allow you to link to one elsewhere. Ancestry does not provide a tool called a chromosome browser which allows you to see if you and others match on the same segment of DNA. Ancestry only tells you THAT you match, not HOW you match.

Each vendor has their strengths and shortcomings. As genealogists, we simply need to understand how to utilize the information available.

I’ll be using examples from all 4 major vendors:

Your matches are the most important information and everything else is based on those matches.

Family Tree DNA

I have tested many family members from both sides of my family at Family Tree DNA using the Family Finder autosomal test which makes my matches there incredibly useful because I can see which family members, in addition to me, my matches match.

Family Tree DNA assigns matches to maternal and paternal sides in a unique way, even if your parents haven’t tested, so long as some close relatives have tested. Let’s take a look.

First Steps Family Tree DNA matches.png

Sign on to your account and click to see your matches.

At the top of your Family Finder matches page, you’ll see three groups of things, shown below.

First Steps Family Tree DNA bucketing

A row of tools at the top titled Chromosome Browser, In Common With and Not in Common With.

A second row of tabs that include All, Paternal, Maternal and Both. These are the maternal and paternal tabs I mentioned, meaning that I have a total of 4645 matches, 988 of which are from my paternal side and 847 of which are from my maternal side.

Family Tree DNA assigns people to these “buckets” based on matches with third cousins or closer if you have them attached in your tree. This is why it’s critical to have a tree and test close relatives, especially people from earlier generations like aunts, uncles, great-aunts/uncles and their children if they are no longer living.

If you have one or both parents that can test, that’s a wonderful boon because anyone who matches you and one of your parents is automatically bucketed, or phased (scientific term) to that parent’s side of the tree. However, at Family Tree DNA, it’s not required to have a parent test to have some matches assigned to maternal or paternal sides. You just need to test third cousins or closer and attach them to the proper place in your tree.

How does bucketing work?

Maternal or Paternal “Side” Assignment, aka Bucketing

If I match a maternal first cousin, Cheryl, for example, and we both match John Doe on the same segment, John Doe is automatically assigned to my maternal bucket with a little maternal icon placed beside the match.
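The idea behind bucketing can be sketched roughly as follows. This is not Family Tree DNA’s actual algorithm, only a simplified illustration: it assumes each match is described by the segments shared with me, and it checks only whether a new match overlaps a segment I share with a relative already linked on one side of my tree. A real implementation would also confirm that the relative and the new match match each other. The names `Segment`, `overlaps` and `bucket_match`, and all positions, are hypothetical.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlaps(a: Segment, b: Segment) -> bool:
    """True when two segments cover some of the same stretch of a chromosome."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def bucket_match(match_segments, known_relatives):
    """Return the set of sides ("maternal"/"paternal") suggested for a match.

    known_relatives maps a side to the segments I share with relatives
    already linked on that side of my tree.
    """
    sides = set()
    for side, relative_segments in known_relatives.items():
        if any(overlaps(m, r) for m in match_segments for r in relative_segments):
            sides.add(side)
    return sides


# Made-up positions: a segment shared with a maternal cousin and a new match.
cheryl = [Segment(3, 10_000_000, 45_000_000)]
john_doe = [Segment(3, 20_000_000, 38_000_000)]
print(bucket_match(john_doe, {"maternal": cheryl, "paternal": []}))
```

A match that overlaps known relatives on both sides would come back with both labels, which loosely corresponds to the “Both” tab.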

First Steps Family Tree DNA match info

Every vendor provides an estimated or predicted relationship based on a combination of total centiMorgans and the longest contiguous matching segment. The actual “linked relationship” is calculated based on where this person resides in your tree.

The common surnames at far right are a very nice feature, but not every tester provides that information. When testers do include surnames at Family Tree DNA, common surnames are bolded. Other vendors have similar features.

People with trees have a blue pedigree icon near their profile picture. Clicking on the pedigree icon will show you their ancestors. Your match’s estimated relationship to you indicates how far back you should expect to share an ancestor.

For example, first cousins share grandparents. Second cousins share great-grandparents. In general, the further back in time your common ancestor, the less DNA you can be expected to share.
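The pattern continues predictably: each additional cousin degree pushes the shared ancestors back one more generation. A small sketch of that rule, with a made-up function name, might look like this.

```python
def shared_ancestors(cousin_degree: int) -> str:
    """Name the ancestral couple that full cousins of a given degree share."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or more")
    greats = cousin_degree - 1
    if greats == 0:
        return "grandparents"
    if greats == 1:
        return "great-grandparents"
    return f"{greats}x great-grandparents"


for degree in range(1, 5):
    print(f"Cousin degree {degree}: {shared_ancestors(degree)}")
```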

You can view relationship information in chart form in my article here or utilize DNAPainter tools, here, to see the various possibilities for the different match levels.

Clicking on the pedigree chart of your match will show you their tree. In my tree, I’ve connected my parents in their proper places, along with Cheryl and Don, mother’s first cousins. (Yes, they’ve given permission for me to utilize their results, so they aren’t always blurred in images.)

Cheryl and Don are my first cousins once removed, meaning my mother is their first cousin and I’m one generation further down the tree. I’m showing the amount of DNA that I share with each of them in red in the format of total DNA shared and longest unbroken segment, taken from the match list. So 382-53 means I share a total of 382 cM and 53 cM is the longest matching block.

First Steps Family Tree DNA tree.png

The Chromosome Browser

Utilizing the chromosome browser, I can see exactly where I match both Don and Cheryl. It’s obvious that I match them on at least some different pieces of my DNA, because the total and longest segment amounts are different.

It’s important to test lots of close relatives because even siblings inherit different pieces of DNA from their parents, and they don’t pass the same DNA to their offspring either – so in each generation the amount of shared DNA is probably reduced. I say probably because sometimes segments are passed entirely and sometimes not at all, which is how we “lose” our ancestors’ DNA over the generations.

Here’s a matching example utilizing a chromosome browser.

First Steps Family Tree DNA chromosome browser.png

I clicked the checkboxes to the left of both Cheryl and Don on the match page, then the Chromosome Browser button, and now you can see, above, on chromosomes 1-16 where I match Cheryl (blue) and Don (red.)

In this view, both Don and Cheryl are being compared to me, since I’m the one signed in to my account and viewing my DNA matches. Therefore, one of the bars at each chromosome represents Don’s DNA match to me and one represents Cheryl’s. Cheryl is the first person and Don is the second. Person match colors (red and blue) are assigned arbitrarily by the system.

My grandfather and Cheryl/Don’s father, Roscoe, were siblings.

You can see that on some segments, my grandfather and Roscoe inherited the same segment of DNA from their parents, because today, my mother gave me that exact same segment that I share with both Don and Cheryl. Those segments are exactly identical and shown in the black boxes.

The only way for us to share this DNA today is for us to have shared a common ancestor who gave it to two of their children who passed it on to their descendants who DNA tested today.

On other segments, in red boxes, I share part of the same segments of DNA with Cheryl and Don, but someone along the line didn’t inherit all of that segment. For example on chromosome 3, in the red box, you can see that I share more with Cheryl (blue) than Don (red.)

In other cases, I share with either Don or Cheryl, but Don and Cheryl didn’t inherit that same segment of DNA from their father, so I don’t share with both of them. Those are the areas where you see only blue or only red.

On chromosome 12, you can see where it looks like Don’s and Cheryl’s segments butt up against each other. The DNA was clearly divided there. Don received one piece and Cheryl got the other. That’s known as a crossover and you can read about crossovers here, if you’d like.

It’s important to be able to view segment information to be able to see how others match in order to identify which common ancestor that DNA came from.

In Common With

You can use the “In Common With” tool to see who you match in common with any match. My first 6 matches in common with Cheryl are shown below. Note that they are already all bucketed to my maternal side.

First Steps Family Tree DNA in common with

You can check the boxes at left for up to 7 individuals to show them on the chromosome browser at once and see if they match you on common segments.

Each matching segment has its own history and may descend from a different ancestor in your common tree.

First Steps 7 match chromosome browser

If combinations of people do match me on a common segment, because these matches are all on my maternal side, they are triangulated and we know they have to descend from a common ancestor, assuming the segment is large enough. You can read about the concept of triangulation here. Triangulation occurs when 3 or more people (who aren’t extremely closely related like parents or siblings) all match each other on the same reasonably sized segment of DNA.
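For readers working in a spreadsheet or script, the triangulation test can be sketched as below. This is a simplified illustration, not any vendor’s method: it assumes you already have the matching segments for each of the three pairs of people, it ignores segments under a chosen size, and it does not measure the size of the overlapping stretch itself or check which side of the tree the matches fall on. The names `Segment` and `triangulates` and the sample positions are made up.

```python
from itertools import product
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int   # base-pair position where the match begins
    end: int     # base-pair position where the match ends
    cm: float    # segment length in centiMorgans


def triangulates(me_a, me_b, a_b, min_cm=7.0):
    """True if all three pairs match on one common stretch of DNA.

    me_a, me_b, a_b: lists of Segments for each pair of testers.
    Segments shorter than min_cm are ignored.
    """
    def big(segments):
        return [s for s in segments if s.cm >= min_cm]

    for s1, s2, s3 in product(big(me_a), big(me_b), big(a_b)):
        same_chromosome = s1.chromosome == s2.chromosome == s3.chromosome
        common_start = max(s1.start, s2.start, s3.start)
        common_end = min(s1.end, s2.end, s3.end)
        if same_chromosome and common_start < common_end:
            return True
    return False


# Made-up example: all three pairs overlap on chromosome 3.
me_a = [Segment(3, 10_000_000, 40_000_000, 30.0)]
me_b = [Segment(3, 20_000_000, 50_000_000, 28.0)]
a_b = [Segment(3, 15_000_000, 45_000_000, 29.0)]
print(triangulates(me_a, me_b, a_b))
```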

If you want to download your matches and work through this process in a spreadsheet, that’s an option too.

Size Matters

Small segments can be identical by chance instead of identical by descent.

  • “Identical by chance” means that you accidentally match someone because your DNA on that segment has been combined from both parents and causes it to match another person, making the segment “look like” it comes from a common ancestor, when it really doesn’t. When DNA is sequenced, both your mother’s and father’s strands are sequenced, meaning that there’s no way to determine which came from whom. Think of a street with Mom’s side and Dad’s side with identical addresses on the houses on both sides. I wrote about that here.
  • “Identical by descent” means that the DNA is identical because it actually descends from a common ancestor. I discussed that concept in the article, We Match, But Are We Related.

Generally, we only utilize 7cM (centiMorgan) segments and above because at that level, about half of the segments are identical by descent and about half are identical by chance, known as false positives. By the time we move above 15 cM, most, but not all, matches are legitimate. You can read about segment size and accuracy here.
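Those rules of thumb can be turned into a simple filter when working through a list of segments. The sketch below uses the 7 cM and 15 cM figures mentioned above; the category labels and the function name are illustrative choices, not standard terms.

```python
def segment_confidence(cm: float) -> str:
    """Classify a matching segment by its length in centiMorgans."""
    if cm < 7:
        return "too small to rely on"
    if cm <= 15:
        return "possible, needs confirmation"
    return "probably identical by descent"


for size in (4.2, 7.0, 12.5, 53.0):
    print(f"{size} cM: {segment_confidence(size)}")
```

Even the top category is only “probably,” since not every segment above 15 cM is legitimate.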

Using “In Common With” and the Matrix

“In Common With” is about who shares DNA. You can select someone you match to see who else you BOTH match. Just because you match two other people doesn’t necessarily mean that it’s on the same segment of DNA. In fact, you could match one person from your mother’s side and the other person from your father’s side.

First Steps match matrix.png

In this example, you match Person B due to ancestor John Doe and Person C due to ancestor Susie Smith. However, Person B also matches person C, but due to ancestor William West that they share and you don’t.

This example shows you THAT they match, but not HOW they match.
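Put another way, “In Common With” is just an overlap of two match lists. A minimal sketch, assuming each person’s match list is available as a set of names (the `matches` data and the function name are hypothetical):

```python
# Who matches whom, as in the three-person example above.
matches = {
    "Me": {"Person B", "Person C"},
    "Person B": {"Me", "Person C"},
    "Person C": {"Me", "Person B"},
}


def in_common_with(matches, me, other):
    """People that both `me` and `other` match, excluding the two themselves."""
    return (matches[me] & matches[other]) - {me, other}


print(in_common_with(matches, "Me", "Person B"))
```

The result lists Person C, but nothing in the data says which ancestor or which segment is responsible for any of the three matches.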

The only way to assure that the matches between the three people above are due to the same ancestor is to look at the segments with a chromosome browser and compare all 3 people to each other. Finding 3 people who match on the same segment, from the same side of your tree means that (assuming a reasonably large segment) you share a common ancestor.

Family Tree DNA has a nice matrix function that allows you to see which of your matches also match each other.

First steps matrix link

The important distinction between the matrix and the chromosome browser is that the chromosome browser shows you where your matches match you, but those matches could be from both sides of your tree, unless they are bucketed. The matrix shows you if your matches also match each other, which is a huge clue that they are probably from the same side of your tree.
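A matrix of this kind is easy to picture as a grid of yes/no answers. The sketch below is not Family Tree DNA’s tool; it assumes you have a list of selected matches and a list of the pairs among them that match each other, and the names used are placeholders.

```python
def build_matrix(people, matching_pairs):
    """Return {person: {other: bool}} marking which matches match each other."""
    pairs = {frozenset(p) for p in matching_pairs}
    return {
        a: {b: a != b and frozenset((a, b)) in pairs for b in people}
        for a in people
    }


# Placeholder data: A and B match each other, C matches neither.
people = ["Match A", "Match B", "Match C"]
matrix = build_matrix(people, [("Match A", "Match B")])
for person, row in matrix.items():
    print(person, " ".join("X" if row[other] else "." for other in people))
```

People who cluster together in the grid are candidates for the same side of the tree, while someone who matches none of the others probably connects through a different line.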

First Steps Family Tree DNA matrix.png

A matrix match is a significant clue in terms of who descends from which ancestors. For example, I know, based on who Amy matches, and who she doesn’t match, that she descends from the Ferverda side and that Charles, Rex and Maxine descend from ancestors on the Miller side.

Looking in the chromosome browser, I can tell that Cheryl, Don, Amy and I match on some common segments.

Matching multiple people on the same segment that descends from a common ancestor is called triangulation.

Let’s take a look at the MyHeritage triangulation tool.

MyHeritage

Moving now to MyHeritage, which provides an easy-to-use triangulation tool, we see the following when clicking on DNA matches on the DNA tab on the toolbar.

First Steps MyHeritage matches

Cousin Cheryl is at MyHeritage too. By clicking on Review DNA Match, the purple button on the right, I can see who else I match in common with Cheryl, plus triangulation.

The list of people Cheryl and I both match is shown below, along with our relationships to each person.

First Steps MyHeritage triangulation

I’ve selected 2 matches to illustrate.

The first match has a little purple icon to the right which means that Amy triangulates with me and Cheryl.

The second match, Rex, has no triangulation icon. That tells me, without looking further, that while Cheryl and I both match Rex, Cheryl matches Rex on a different segment than I do.

Without additional genealogy work, using DNA alone, I can’t say whether or not Cheryl, Rex and I all share a common ancestor. As it turns out, we do. Rex is a known cousin who I tested. However, in an unknown situation, I would have to view the trees of those matches to make that determination.

Triangulation

Clicking on the purple triangulation icon for Amy shows me the segments that all 3 of us, me, Amy and Cheryl share in common as compared to me.

First Steps MyHeritage triangulation chromosome browser.png

Cheryl is red and Amy is yellow. The one segment bracketed with the rounded rectangle is the segment shared by all 3 of us.

Do we have a common ancestor? I know Cheryl and I do, but maybe I don’t know who Amy is. Let’s look at Amy’s tree which is also shown if I scroll down.

First Steps MyHeritage common ancestor.png

Amy didn’t have her tree built out far enough to show our common ancestor, but I immediately recognized the surname Ferverda found in her tree a couple of generations back. Darlene was the daughter of Donald Ferverda who was the son of Hiram Ferverda, my great-grandfather.

Hiram was the father of Cheryl’s father, Roscoe and my grandfather, John Ferverda.

First Steps Hiram Ferverda pedigree.png

Amy is my first cousin twice removed and that segment of DNA that I share with her is from either Hiram Ferverda or his wife Eva Miller.

Now, based on who else Amy matches, I can probably tell whether that segment descends from Hiram or Eva.

Viva triangulation!

Theory of Family Relativity

MyHeritage’s Theory of Family Relativity suggests how you and a DNA match might share a common ancestor, whenever MyHeritage can calculate how the 2 people are potentially related.

MyHeritage uses a combination of tools to make that connection, including:

  • DNA matches
  • Your tree
  • Your match’s tree
  • Other people’s trees at MyHeritage, FamilySearch and Geni, if the common ancestor cannot be found by comparing your tree against your DNA match’s MyHeritage tree
  • Documents in the MyHeritage data collection, such as census records, for example.

MyHeritage theory update

To view the Theories, click on the purple “View Theories” banner or “View theory” under the DNA match.

First Steps MyHeritage theory of relativity

The theory is displayed in summary format first.

MyHeritage view full theory

You can click on the “View Full Theory” to see the detail and sources about how MyHeritage calculated various paths. I have up to 5 different theories that utilize separate resources.

MyHeritage review match

A wonderful aspect of this feature is that MyHeritage shows you exactly the information they utilized and calculates a confidence factor as well.

All theories should be viewed as exactly that and should be evaluated critically for accuracy, taking into consideration sources and documentation.

I wrote about using Theories of Relativity, with instructions, here and here.

I love this tool and find the Theories mostly accurate.

AncestryDNA

Ancestry doesn’t offer a chromosome browser or triangulation but does offer a tree view for people that you match, so long as you have a subscription. In the past, a special “Light” subscription for DNA only was available for approximately $49 per year that provided access to the trees of your DNA matches and other DNA-related features. You could not order online and had to call support, sometimes asking for a supervisor in order to purchase that reduced-cost subscription. The “Light” subscription did not provide access to anything outside of DNA results, meaning documents, etc. I don’t know if this is still available.

After signing on, click on DNA matches on the DNA tab on the toolbar.

You’ll see the following match list.

First Steps Ancestry matches

I’ve tested twice at Ancestry, the second time when they moved to their new chip, so I’m my own highest match. Click on any match name to view more.

First Steps Ancestry shared matches

You’ll see information about common ancestors if you have some in your trees, plus the amount of shared DNA along with a link to Shared Matches.

I found one of the same cousins at Ancestry whose match we were viewing at MyHeritage, so let’s see what her match to me at Ancestry looks like.

Below are my shared matches with that cousin. The notes to the right are mine, not provided by Ancestry. I make extensive use of the notes fields provided by the vendors.

First Steps Ancestry shared matches with cousin

On your match list, you can click on any match, then on Shared Matches to see who you both match in common. While Ancestry provides no chromosome browser, you can see the amount of DNA that you share and trees, if any exist.

Let’s look at a tree comparison when a common ancestor can be detected in a tree within the past 7 generations.

First Steps Ancestry view ThruLines.png

What’s missing of course is that I can’t see how we match because there’s no chromosome browser, nor can I see if my matches match each other.

Stitched Trees

What I can see, if I click on “View ThruLines” above or ThruLines on the DNA Summary page on the main DNA tab, is all of the people I match who Ancestry THINKS descend from a common ancestor with me. This ancestor information isn’t always taken from either person’s tree.

For example, if my match hadn’t included Hiram Ferverda in her tree, Ancestry would use other people’s trees to “stitch them together” such that the tester is shown to be descended from a common ancestor with me. Sometimes these stitched trees are accurate and sometimes they are not, although they have improved since they were first released. I wrote about ThruLines here.

First Steps Ancestry ThruLines tree

In closer generations, especially if you are looking to connect with cousins, tree matching is a very valuable tool. In the graphic above, you can see all of the cousins who descend from Hiram Ferverda who have tested and DNA match to me. These DNA matches to me either descend from Hiram according to their trees, or Ancestry believes they descend from Hiram based on other people’s trees.

With more distant ancestors, other people’s trees are increasingly likely to be copied with no sources, so take them with a very large grain of salt (perchance the entire salt lick.) I use ThruLines as hints, not gospel, especially the further back in time the common ancestor. I wish they reached back another couple of generations. They are great hints and they end with the 7th generation where my brick walls tend to begin!

23andMe

I haven’t mentioned 23andMe yet in this article. Genealogists do test there, especially adoptees who need to fish in every pond.

23andMe is often the 4th choice of the major 4 vendors for genealogy due to the following challenges:

  • No tree support, other than allowing you to link to a tree at FamilySearch or elsewhere. This means no tree matching.
  • Fewer than 2000 matches, meaning that every person is limited to a maximum of 2000 matches, minus however many of those 2000 don’t opt in for genealogical matching. Given that 23andMe’s focus is increasingly health, my number of matches continues to decrease and is currently just over 1500. The good news is that those 1500 are my highest, meaning closest, matches. The bad news is that genealogy is not 23andMe’s focus.

If you are an adoptee, a die-hard genealogist or specifically interested in ethnicity, then test at 23andMe. Otherwise all three of the other vendors would be better choices.

However, like the other vendors, 23andMe does have some features that are unique.

Their ethnicity predictions are acknowledged to be excellent. Ethnicity at 23andMe is called Ancestry Composition, and you’ll see that immediately when you sign in to your account.

First Steps 23andMe DNA Relatives.png

Your matches at 23andMe are found under DNA Relatives.

First Steps 23andMe tools

At left, you’ll find filters and the search box.

The Mom’s side and Dad’s side filters sort your matches if you’ve tested your parents, but this isn’t like the Family Tree DNA bucketing, which assigns maternal and paternal sides using relatives through third cousins if your parents aren’t available for testing.

Family names aren’t your family names, but the top family names that match to you. Guess what my highest name is? Smith.

However, Ancestor Birthplaces are quite useful because you can sort by country. For example, my mother’s grandfather Ferverda was born in the Netherlands.

First Steps 23andMe country.png

If I click on Netherlands, I can see my 5 matches with ancestors born in the Netherlands. Of course, this doesn’t mean that I match because of my match’s Dutch ancestors, but it does provide me with a place to look for a common ancestor and I can proceed by seeing who I match in common with those matches. Unfortunately, without trees we’re left to rely on ancestor birthplaces and family surnames, if my matches have entered that information.

One of my Dutch matches also matches my Ferverda cousin. Given that connection, and that the Ferverda family immigrated from Holland in 1868, that’s a starting point.

MyHeritage has a similar feature and is much more prevalent in Europe.

By clicking on my Ferverda cousin, I can view the DNA we share, who we match in common, our common ethnicity and more. I have the option of comparing multiple people in the chromosome browser by clicking on “View DNA Comparison” and then selecting who I wish to compare.

First Steps 23andMe view DNA Comparison.png

By scrolling down instead of clicking on View DNA Comparison, I can view where my Ferverda cousin matches me on my chromosomes, shown below.

First Steps 23andMe chromosome browser.png

23andMe identifies completely identical segments, which would be painted in dark purple according to the legend at bottom left.

Adoptees love this feature because it would immediately differentiate between half and full siblings. Full siblings share approximately 25% of the exact DNA on both their maternal and paternal strands of DNA, while half siblings only share the DNA from one parent – assuming their parents aren’t closely related. I share no completely identical DNA with my Ferverda cousin, so no segments are painted dark purple.
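As a rough sketch of that reasoning, assuming the two people are already known to be siblings of some kind and their parents aren’t closely related, the share of completely identical DNA separates the two cases. The 10% cutoff below is an arbitrary illustrative value sitting between 0% and 25%, not a figure from 23andMe, and the function name is made up.

```python
def sibling_type(fully_identical_fraction: float) -> str:
    """Guess full vs. half sibling from the share of fully identical DNA.

    The 10% cutoff is an illustrative assumption, not a vendor rule.
    """
    if not 0.0 <= fully_identical_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fully_identical_fraction >= 0.10:
        return "full siblings"
    return "half siblings"


print(sibling_type(0.25))  # roughly what full siblings are expected to show
print(sibling_type(0.0))   # no dark purple segments at all
```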

23andMe and Ancestry Maps Show Where Your Matches Live

Another reason that adoptees and people searching for birth parents or unknown relatives like 23andMe is because of the map function.

After clicking on DNA Relatives, click on the Map function at the top of the page which displays the following map.

First Steps 23andMe map

This isn’t a map of where your matches’ ancestors lived, but of where your matches THEMSELVES live. Furthermore, you can zoom in and click on a button to display the name of the individual and the city where they live, or whatever they entered in the location field.

First Steps 23andMe your location on map.png

I entered a location in my profile and confirmed that the location indeed displays on my matches’ maps by signing on to another family member’s account. What I saw is the display above. I’d wager that most testers don’t realize that their home location and photo, if entered, are being displayed to their matches.

I think sharing my ancestors’ locations is a wonderful, helpful, idea, but there is absolutely no reason whatsoever for anyone to know where I live and I feel it’s stalker-creepy and a safety risk.

First Steps 23andMe questions.png

If you enter a location in this field in your profile, it displays on the map.

If you test with 23andMe and you don’t want your location to display on this map to your matches, don’t answer any question that asks you where you call home or anything similar. I never answer any questions at 23andMe. They are known for asking you the same question repeatedly, in multiple locations and ways, until you relent and answer.

Ancestry has a similar map feature and they’ve also begun to ask you questions that are unrelated to genealogy.

Ancestry Map Shows Where Your Matches Live

At Ancestry, when you click to see your DNA matches, look to the right at the map link.

First Steps Ancestry map link.png

By clicking on this link, you can see the locations that people have entered into their profile.

First Steps Ancestry match map.png

As you can see, above, I don’t have a location entered and I am prompted for one. Note that Ancestry does specifically say that this location will be shown to your matches.

You can click on the Ancestry Profile link here, or go to your Personal Profile by clicking the dropdown under your user name in the upper right hand corner of any page.

This is important because if you DON’T want your location to show, you need to be sure there is nothing entered in the location field.

First Steps Ancestry profile.png

Under your profile, click “Edit.”

First Steps Ancestry edit profile.png

After clicking edit, complete the information you wish to have public or remove the information you do not.

First Steps Ancestry location in profile.png

Sometimes Your Answer is a Little More Complicated

This is a First Steps article. Sometimes the answer you seek might be a little more complicated. That’s why there are specialists who deal with this all day, every day.

What issues might be more complex?

If you’re just starting out, don’t worry about these things for now. Just know when you run into something more complex or that doesn’t make sense, I’m here and so are others. Here’s a link to my Help page.

Getting Started

What do you need to get started?

  • You need to take a DNA test, or more specifically, multiple DNA tests. You can test at Ancestry or 23andMe and transfer your results to both Family Tree DNA and MyHeritage, or you can test directly at all vendors.

Neither Ancestry nor 23andMe accepts uploads, meaning other vendors’ tests, but both MyHeritage and Family Tree DNA accept most file versions. Instructions for how to download and upload your DNA results are found below, by vendor:

Both MyHeritage and Family Tree DNA charge a minimal fee to unlock their advanced features such as chromosome browsers and ethnicity if you upload transfer files, but it’s less costly in both cases than testing directly. However, if you want the MyHeritage DNA plus Health or the Family Tree DNA Y DNA or Mitochondrial DNA tests, you must test directly at those companies for those tests.

  • It’s not required, but it would be in your best interest to build as much of a tree at all three vendors as you can. Every little bit helps.

Your first tree-building step should be to record what your family knows about your grandparents and great-grandparents, aunts and uncles. Here’s what my first step attempt looked like. It’s cringe-worthy now, but everyone has to start someplace. Just do it!

You can build a tree at either Ancestry or MyHeritage and download your tree for uploading at the other vendors. Or, you can build the tree using genealogy software on your computer and upload to all 3 places. I maintain my primary tree on my computer using RootsMagic. There are many options. MyHeritage even provides free tree builder software.

Both Ancestry and MyHeritage offer research/data subscriptions that provide you with hints to historical documents that increase what you know about your ancestors. The MyHeritage subscription can be tried for free. I have full subscriptions to both Ancestry and MyHeritage because they both include documents in their collections that the other does not.

Please be aware that document suggestions are hints and each one needs to be evaluated in the context of what you know and what’s reasonable. For example, if your ancestor was born in 1750, they are not included in the 1900 census, nor do women have children at age 70. People do have exactly the same names. FindAGrave information is entered by humans and is not always accurate. Just sayin’…

Evaluate critically and skeptically.
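If you keep your research in a spreadsheet or script, a couple of these sanity checks are easy to automate. The sketch below is only an illustration; the age limits are assumptions chosen for the example, and the function names are made up.

```python
MAX_LIFESPAN = 110       # assumed upper bound on a person's age in a record
MIN_MATERNAL_AGE = 12    # assumed lower bound on a mother's age at a birth
MAX_MATERNAL_AGE = 55    # assumed upper bound on a mother's age at a birth


def census_hint_is_plausible(birth_year: int, census_year: int) -> bool:
    """Could someone born in birth_year be alive for this census?"""
    age = census_year - birth_year
    return 0 <= age <= MAX_LIFESPAN


def child_hint_is_plausible(mother_birth_year: int, child_birth_year: int) -> bool:
    """Could a woman born in mother_birth_year have had this child?"""
    age = child_birth_year - mother_birth_year
    return MIN_MATERNAL_AGE <= age <= MAX_MATERNAL_AGE


print(census_hint_is_plausible(1750, 1900))  # ancestor would be 150
print(child_hint_is_plausible(1800, 1870))   # mother would be 70
```

Checks like these only flag the impossible. A hint that passes still needs to be weighed against names, places and the rest of what you know.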

Ok, Let’s Go!

When your DNA results are ready, sign on to each vendor, look at your matches and use this article to begin to feel your way around. It’s exciting and the promise is immense. Feel free to share the link to this article on social media or with anyone else who might need help.

You are the cumulative product of your ancestors. What better way to get to know them than through their DNA that’s shared between you and your cousins!

What can you discover today?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

When DNA Leads You Astray

I’m currently going through what I refer to as “the great purge.”

This occurs when you can’t stand the accumulated piles and boxes of “stuff” and the file drawers are full, so you set about throwing away and giving away. (Yes, I know you just cringed. Me too.)

The great news is that I’ve run across so much old (as in decades old) genealogy from when I first began this journey. I used to make lists of questions and a research “to do” list. I was much more organized then, but there were also fewer “squirrel moments” available online to distract me with “look here, no, over here, no, wait….”

Most of those questions on my old genealogy research lists have (thankfully) since been answered, slowly, one tiny piece of evidence at a time. Believe me, that feeling is very rewarding and while on a daily basis we may not think we’re making much progress; in the big picture – we’re slaying that dragon!

However, genealogy is also fraught with landmines. If I had NOT found the documentation before the days of DNA testing, I could easily have been led astray.

“What?” you ask. “But DNA doesn’t lie.” No, it doesn’t, but it will sure let you kid yourself about some things.

DNA is a joker and has no problem allowing you to fool yourself and by virtue of that, others as well.

Joke’s On Me

Decades ago, Aunt Margaret told me that her grandmother’s mother was “a Rosenbalm from up on the Lee County (VA) border.”

Now, at that time, I had absolutely NO reason to doubt what she said. After all, it’s her grandmother, Margaret Claxton/Clarkson who she knew personally, who didn’t pass away until my aunt was in her teens. Plenty close enough to know who Margaret Claxton’s mother was. Right?

DNA Astray Rosenbalm

Erroneous pedigree chart. Rebecca Rosenbalm is NOT the mother of Elizabeth Claxton/Clarkson.

I filled Rebecca Rosenbalm’s name into the appropriate space on my pedigree chart, was happy and smugly smiling like a Cheshire cat, right up until I accidentally discovered that the information was just plain wrong.

Uh oh….

Time Rolls On

As records became increasingly available, both in transcribed fashion and online, Hancock County, TN death certificates eventually could be obtained, one way or another. Being a dutiful genealogist, I collected all relevant documents for my ancestors, contentedly filing them in the “well that’s done” category – that is right up until Margaret Clarkson Bolton’s death certificate stopped me dead in my tracks.

margaret clarkson bolton death

Oops

Margaret’s mother wasn’t listed as Rebecca Rosenbalm, nor Rebecca anyone. She was listed as Betsy Speaks. Or was it Spears? In our family, Betsy is short for Elizabeth.

Who the heck was Elizabeth Speaks, or Spears? This was one fine monkey wrench!

A trip to Hancock County, Tennessee was in order.

I dug through dusty deed and court records, sifted through the archives in basements and the old jail building where I just KNEW my ancestors had inhabited cells at one time or another.

Yes, my ancestors’ records really were in jail!

Records revealed that the woman in question was Elizabeth Speaks, not Spears, although the Spears family did live in the area and had “married in” to many local families. Nothing is ever simple and our ancestors do have a perverse sense of humor.

Elizabeth Speak(s) was the daughter of Charles Speak, and the Speak family lived a few miles across the border into Lee County, Virginia. This high mountain land borders two states and three counties, so records are scattered among them – not to mention that two fires in the Hancock County courthouse make research challenging.

Why?

I asked my Aunt Margaret, who was still living at the time, about this apparent discrepancy. She told me that the Rosenbalms “up in Rose Hill, Virginia” told her that her grandmother, Margaret Claxton/Clarkson, was kin to them, so my aunt had assumed (there’s that word again) that Margaret Claxton’s mother was their Rebecca Rosenbalm.

Wrong!

The Kernel of Truth

Like so many family stories, there is a kernel of truth, surrounded by a multitude of errors. Distilling the grain of truth is the challenge, of course.

Margaret Claxton’s mother was Elizabeth (Betsy) Speak, and Elizabeth’s father was Charles Speak. Charles Speak’s sister, Rebecca, married William Henderson Rosenbalm in 1854, had 4 children and died in February 1859. So there indeed was a woman named Rebecca (Speaks) Rosenbalm who had died young and wasn’t well known.

Rebecca’s sister Frances “Fanny” Speak also married that same William Henderson Rosenbalm in November 1859, a few months after Rebecca had died. Fanny also had 4 children, one of whom was also named Rebecca Rosenbalm. Do you see a trend here?

So, indeed there were 7 living Rosenbalm children who were first cousins to Elizabeth Speak, who married Samuel Claxton and lived a dozen miles away, over the mountains and across the Powell River. Now a dozen miles might not sound like much today, but in the mountains during horse and wagon days, that distance wasn’t trivial and required a multi-day commitment for a visit. In other words, the next generation of the family knew of their cousins but didn’t know them well.

The following generation included my Aunt Margaret, who was told by those cousins that she was related to them through the Rosenbalm family. While that was true for the Rosenbalm cousins, it was not true for Aunt Margaret, who was related to the Rosenbalms through their common Speak ancestor.

Here’s what the family tree really looks like, only showing the lines under discussion.

DNA astray correct pedigree

You can see why Aunt Margaret might not know specifics. She was actually several generations removed from the common ancestor. She knew THAT they were related, but not HOW they were related, and there were several Rebeccas in several branches of the family.

Why Does This Matter?

You’ve probably guessed by now that someplace in here, there’s a moral to this story, so here it is!

You may have already surmised that I have autosomal DNA matches to cousins through the Rosenbalm/Speaks line.

DNA astray pedigree match

This is one example, but there are more, some being double cousins meaning two of Nicholas Speak’s 11 children’s descendants have intermarried. Life is a lot more complex in those hills and hollers than people think – and unraveling the relationships, both paper and genetic (which are sometimes two different things) is challenging.

DNA astray chromosome 10

I match this fourth cousin once removed (4C1R) on a healthy 18 cM segment on chromosome 10.

Wrong Conclusions

Now, think back to where I was originally in my research. I knew that Margaret Claxton/Clarkson was my aunt’s grandmother. I knew nothing at all about the Speak family and had never heard that surname.

Had I ONLY been looking to confirm the Rosenbalm connection, I certainly would have confirmed that I’m related to the Rosenbalm family descendants with this match. Except the conclusion that I descend from a Rosenbalm ancestor would have been WRONG. What we share are the Speak ancestors.

So really, the DNA didn’t lie, but unless I dissected what the DNA match was really telling me carefully and methodically with NO PRECONCEIVED NOTIONS, I would have “confirmed” erroneous information. Or, at least I would have thought that I confirmed it.

I would actually have been doing something worse: convincing myself of “facts” that weren’t accurate, which means I would then have been spreading around those cancerous bad trees. Guaranteed, I do NOT want to be that person.

Foolers

I can tell you here and now that I have found several matches that were foolers because I share multiple ancestors with a person that I match, even if those multiple ancestors aren’t known to either or both of us. Every single DNA segment has its own unique history. I match one individual on two segments, one segment through my mom and one segment through my dad. Fortunately, we’ve identified both ancestors now, but imagine my initial surprise and confusion, especially given that my parents don’t share any common ancestors, communities or locations.

We have to evaluate all of the evidence to confirm that the conclusion being drawn is accurate.

DNA astray painting

One of the sanity checks I use, in addition to triangulation, is to paint my matches with known ancestors on my chromosomes using DNAPainter. Here’s the match to my cousin, and it overlaps with other people who share the same ancestor couple. Several matches are obscured behind the black box. If I discover someone that I supposedly match from a different ancestor couple sharing this segment of my father’s DNA, that’s a red neon flashing sign that something is wrong and I need to figure out what and why.

Ignoring this problem and hoping it will go away doesn’t work. I’ve tried😊

Three possible things can be wrong:

  1. The segment is identical by chance, not by descent. With a segment of 18 cM, that’s extremely unlikely. Triangulation with other people on this same segment on the same parent’s side should eliminate most false matches over 7 cM. The larger the match, the more likely it is NOT identical by chance, meaning that it IS identical by descent or genealogically relevant.
  2. The segment is accurately matched but the genealogy is confused – such as my Rosenbalm example. This can happen with multiple ancestors, or descent from the same family but through an unknown connection. Looking for other connections to this family and sorting through matches’ trees often provides hints that resolve this situation. In my case, I might have noticed that I matched other people who descended from Nicholas Speak, which would not have been the case had I descended through the Rosenbalm family.
  3. The third scenario is that the genealogy is plain flat out wrong. Yea, I know this one hurts. Get the saw ready.
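If you keep your painted segments in a spreadsheet or export, the sanity check described above can be roughed out in a few lines of Python. This is only a sketch under my own assumptions, not a DNAPainter feature: the field names, the cousin names, the ancestor labels and the chromosome positions are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PaintedSegment:
    match_name: str       # hypothetical cousin label
    chromosome: int
    start: int            # start position in base pairs (illustrative)
    end: int              # end position in base pairs (illustrative)
    parent_side: str      # "paternal" or "maternal"
    ancestor_line: str    # the ancestor couple the match is attributed to


def overlaps(a: PaintedSegment, b: PaintedSegment) -> bool:
    """True when two segments share part of the same parent's chromosome."""
    return (
        a.chromosome == b.chromosome
        and a.parent_side == b.parent_side
        and a.start < b.end
        and b.start < a.end
    )


def find_conflicts(segments):
    """Return pairs of matches that overlap but point to different ancestors."""
    conflicts = []
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if overlaps(a, b) and a.ancestor_line != b.ancestor_line:
                conflicts.append((a.match_name, b.match_name))
    return conflicts


segments = [
    PaintedSegment("Cousin A", 10, 50_000_000, 68_000_000, "paternal", "Speak"),
    PaintedSegment("Cousin B", 10, 55_000_000, 66_000_000, "paternal", "Speak"),
    PaintedSegment("Cousin C", 10, 60_000_000, 70_000_000, "paternal", "Rosenbalm"),
    PaintedSegment("Cousin D", 10, 60_000_000, 70_000_000, "maternal", "Other"),
]

conflicts = find_conflicts(segments)
for first, second in conflicts:
    print(f"Red flag: {first} and {second} overlap but are painted to different lines")
```

A flagged pair doesn't tell you which of the three problems you have, only that one of them needs to be tracked down.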

The Devil in the Details

Always evaluate your matches in light of what you don’t know, not in order to confirm what you think you know. Play the devil’s advocate – all the time. After all, the devil really is in the details.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some (but not all) of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Elizabeth Warren’s Native American DNA Results: What They Mean

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important part of comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

  • Family History and DNA Science – How this works.
  • Elizabeth Warren’s Genealogy
  • Elizabeth Warren’s DNA Results
  • Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will include:

  • Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?
  • Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed exponentially since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customers’ results several times. The reference populations improve, the vendors’ internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that major vendors in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors are MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, which utilizes imputation, is not yet quite up to snuff on its ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native, I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Unexpected Results

DNA testing is notorious for unveiling unexpected results. Adoptions, unknown parents, unexpected ethnicities, previously unknown siblings and half-siblings and more.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

  • There is no Native ancestor
  • The Native DNA has “washed out” over the generations, but they did have a Native ancestor
  • We haven’t yet learned to recognize all of the segments that are Native
  • The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

That means that if you have a Native ancestor 6 generations back in your tree, you share 1.56% of their DNA, on average. I wrote the article, Ancestral DNA Percentages – How Much of Them is in You? to explain how this works.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
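The halving described above is easy to put into numbers. Here's a minimal Python sketch; the function name `expected_share` is just my own illustrative label, and the figures it produces are averages, not what any one person actually inherits.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA from one ancestor.

    generations_back = 1 is a parent, 2 a grandparent, and so on.
    The share is cut in half with each additional generation.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    return 0.5 ** generations_back


# Print the average share for ancestors 1 through 10 generations back.
for generations in range(1, 11):
    share = expected_share(generations)
    print(f"{generations:>2} generations back: {share:.2%} on average")
```

The row for 6 generations corresponds to the 1.56% figure mentioned above. Remember that real inheritance is lumpy, so your actual share from any given ancestor can be higher, lower, or zero.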

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

  • World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.
  • Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.
  • Vendor Algorithm – None of the vendors provide the exact nature of their internal algorithms that they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.
  • Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.
  • Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.
  • Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parents’ DNA test(s).
  • Two Chromosomes – You have two copies of each chromosome, one from your mother and one from your father. DNA testing can’t easily separate those copies, so the exact same “address” on the chromosomes you inherited from your mother and father may carry two different ethnicities. Unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parents’ DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
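For readers who do want to see the idea in action, here is a toy Python sketch of the difference. It is a simplified illustration under my own assumptions, not how any vendor actually phases DNA: it assumes we already know which letters came from Mom and which from Dad, the "Native" reference segment is simply all As as in the example above, and the function name and test strings are made up.

```python
def classify_segment(reference: str, maternal_strand: str, paternal_strand: str) -> str:
    """Classify an apparent match to a reference segment.

    A location "matches" when at least one of your two strands carries
    the reference letter there. The match is identical by descent only
    if a single parent's strand supplies the reference letter at every
    location; if it takes both parents zig-zagging back and forth, the
    match is identical by chance.
    """
    if not (len(reference) == len(maternal_strand) == len(paternal_strand)):
        raise ValueError("all three sequences must be the same length")

    from_mother = [m == r for m, r in zip(maternal_strand, reference)]
    from_father = [p == r for p, r in zip(paternal_strand, reference)]

    # If any location matches on neither strand, there is no match at all.
    if not all(m or f for m, f in zip(from_mother, from_father)):
        return "no match"
    if all(from_mother):
        return "identical by descent (maternal)"
    if all(from_father):
        return "identical by descent (paternal)"
    return "identical by chance"


native_reference = "AAAAAAAAAA"

# Mom's strand carries every A, so the segment phases to her.
print(classify_segment(native_reference, "AAAAAAAAAA", "TCGTCGTCGT"))

# The As alternate between Mom and Dad, so the segment only looks Native.
print(classify_segment(native_reference, "ATATATATAT", "TATATATATA"))
```

The longer the segment, the harder it is for that zig-zag pattern to hold up by accident, which is why bigger segments are more trustworthy.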

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Elizabeth’s mother, Pauline Herring’s line is shown, at WikiTree, as follows:

Notice that of Elizabeth Warren’s 16 great-great-great grandparents on her mother’s side, 9 are missing.

Paper trail being unfruitful, Elizabeth Warren, like so many, sought to validate her family story through DNA testing.

Elizabeth Warren’s DNA Results

Elizabeth Warren didn’t test with one of the major vendors. Instead, she went directly to a specialist. That’s the equivalent of skipping the family practice doctor and going to the Mayo Clinic.

Elizabeth Warren had test results interpreted by Dr. Carlos Bustamante at Stanford University. You can read the actual report here and I encourage you to do so.

From the report, here are Dr. Bustamante’s credentials:

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

He’s no lightweight in the study of Native American DNA. This 2012 paper, published in PLOS Genetics, Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas focused on teasing out Native American markers in admixed individuals.

From that paper:

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10^-7, corrected p-value of 2.6 x 10^-4) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA is the scientific methodology utilized to group individuals to and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA, which would be all Native. If they were only 25% Native, you will still receive about 6.25% of their DNA, but only one fourth of that 6.25% is possibly Native – so 1.56%. You could also receive NONE of their Native DNA.
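The arithmetic in that answer can be sketched in Python as follows. The function name `expected_native_share` and its parameters are my own illustrative choices, and the result is an average expectation only, since any individual could inherit more, less, or none of the ancestor's Native DNA.

```python
def expected_native_share(generations_back: int, ancestor_native_fraction: float = 1.0) -> float:
    """Average fraction of your DNA that is Native from one ancestor.

    generations_back: 1 for a parent, 2 for a grandparent, and so on.
    ancestor_native_fraction: 1.0 for an unadmixed Native ancestor,
    0.25 for an ancestor who was themselves only 25% Native.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    if not 0.0 <= ancestor_native_fraction <= 1.0:
        raise ValueError("ancestor_native_fraction must be between 0 and 1")
    share_of_ancestor = 0.5 ** generations_back
    return share_of_ancestor * ancestor_native_fraction


unadmixed = expected_native_share(4, 1.0)    # fully Native ancestor
admixed = expected_native_share(4, 0.25)     # ancestor was 25% Native

print(f"Unadmixed ancestor, 4 generations back: {unadmixed * 100:.4f}%")
print(f"25% Native ancestor, 4 generations back: {admixed * 100:.4f}%")
```

This is why the word "unadmixed" matters so much in the report: the same amount of Native DNA could come from a fully Native ancestor further back, or from an admixed ancestor who is closer in time.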

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases in addition. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations and is what allows the identification of complete ancestral segments that are identified as Native or any other population.

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict most likely when a fully Native ancestor was present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

In the meantime, you might want to read my article, Native American DNA Resources.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Concepts – Sibling and Twin DNA Matching

Lots of people are giving their siblings DNA test kits.  That’s a great idea, especially if your parents aren’t available for testing, because siblings do inherit part of the same DNA from their parents, but not all of the same DNA. That means testing siblings is a great opportunity for more genealogical matches!

Recently, a friend asked me why his fraternal twin has matches to people he doesn’t, and vice versa.  Great question, so let’s take a look at what to expect from matches with siblings.

First, identical twins share exactly the same DNA because they are created as a result of the division of the same egg that has been fertilized by the father’s sperm. Identical twins’ matches should be identical.

A fraternal twin is exactly the same as a sibling. Two separate sperm fertilize two separate eggs and they gestate together, at the same time.

Second, let’s talk just a minute about Y and mitochondrial DNA, then we’ll discuss autosomal DNA.

Full siblings share:

  • Mitochondrial DNA – Exactly the same, unless a mutation occurred
  • Y DNA – Males will share exactly the same, unless a mutation occurred.  Females don’t have a Y chromosome.
  • Autosomal DNA – Approximately 50% of autosomal DNA

To obtain detailed Y and mitochondrial DNA results, you’ll need to test with Family Tree DNA. They are the only vendor offering these tests.

For autosomal matching, you can test with a number of vendors including: Family Tree DNA, Ancestry, 23andMe and MyHeritage.

You can read more about the different kinds of testing here, and a comparison of the different tests and vendors here.

50% the Same – 50% Different

Siblings share approximately 50% of the same DNA of the parents.  The other 50% is different DNA that they received from the parents that the other sibling did not receive.

In the conceptual example above, you can see that each child inherited 4 segments of the 8 total offered by their parents.  Only two of those segments were the same for both siblings, segments 3 and 4.  Of these two siblings, no one inherited parental segments 7 and 8.  Perhaps a third child would.

In other words, siblings can expect to see many of the same people in their match list and several that are different. In our example, the same people would be matching both siblings on segments 3 and 4.  People matching child 1 but not child 2 would be matching on segments 1 and 2.  People matching child 2 but not child 1 would be matching on segments 5 and 6.
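The conceptual example can be written out with sets. This is only a sketch of the 8-segment illustration, not of real inheritance, and the variable names are mine:

```python
# Parental segments (numbered 1-8) inherited by each child in the conceptual example.
child1 = {1, 2, 3, 4}
child2 = {3, 4, 5, 6}
all_parental = set(range(1, 9))

shared = child1 & child2                          # people matching here appear on both lists
only_child1 = child1 - child2                     # matches only child 1 will see
only_child2 = child2 - child1                     # matches only child 2 will see
not_inherited = all_parental - (child1 | child2)  # segments neither child received

print("Both siblings:", sorted(shared))
print("Child 1 only:", sorted(only_child1))
print("Child 2 only:", sorted(only_child2))
print("Neither child:", sorted(not_inherited))
```

Adding a third child is just another set, which is why every additional sibling who tests can recover parental DNA the others did not receive.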

The reason you’ll see the same people on your match list is because you did inherit 50% of the same DNA from your parents.

There are two reasons you’ll see different matches on your match lists.

First, some of the matches on your list won’t match your sibling because the two of you inherited different pieces of DNA from your parents.  Your sibling will match people on the DNA that they received from your parents that you didn’t receive, and vice versa.

Some Matches are Identical By Chance (IBC)

Another reason for different matches is because you and your sibling will have people on both of your match lists that don’t match either parent as a result of IBC or identical by chance matching. That’s where the DNA of your match just happens to match you by virtue of zigzagging back and forth between your Mom’s and Dad’s DNA that you carry.

As you can see in this example, your pink DNA came from your Mom, and blue from your Dad, but your match carries some of both values, T and A.  This means they match you, but not because they match either of your parents.  Just an accident of circumstance. That’s what IBC is.

Telling the Difference

I wrote about matches that are identical by descent (IBD), meaning because you inherited that DNA from your parents, and identical by chance (IBC) in this article.

Unfortunately, your DNA is mixed together and without other known relatives testing, it’s impossible to discern which DNA is inherited from your mother and which from your father. This is exactly why we encourage people to have known relatives test such as parents, grandparents and cousins.  Who you match on which segments indicates where those segments descended from in your family tree.

If one or both parents are living, that’s the best way of discerning which matches are identical by descent and which are by chance.

A recent project with Philip Gammon showed, by segment size, how likely a match is to be genuine or identical by chance.  If both parents have tested, he offers the free Match-Maker-Breaker tool to do this analysis for you.

The bottom line is that when comparing your matches to those of your siblings, about 20-25% of everyone’s total matches are identical by chance, especially those at lower centiMorgan levels.

Taking 20% as a round figure, the remaining 80% or so will be divided roughly half and half, meaning half will match both you and a sibling, and half will match only you. Therefore, roughly 40% of your matches will be in common with a particular sibling, 40% will be legitimate matches that don’t match your sibling, and the remaining 20% will be identical by chance.
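If you want to apply that rough split to your own match count, the arithmetic looks like this. It is a back-of-the-envelope sketch only. The function name and the 1,000-match example are hypothetical, and the 20% default is simply the low end of the 20-25% range quoted above:

```python
def split_matches(total_matches, ibc_percent=20):
    """Rough expected breakdown of a match list when compared with one sibling."""
    ibc = total_matches * ibc_percent // 100   # identical by chance
    legitimate = total_matches - ibc           # everything else
    shared_with_sibling = legitimate // 2      # roughly half also match the sibling
    only_you = legitimate - shared_with_sibling
    return {
        "identical_by_chance": ibc,
        "shared_with_sibling": shared_with_sibling,
        "only_you": only_you,
    }

# Hypothetical list of 1,000 matches.
print(split_matches(1000))
```

Passing `ibc_percent=25` shows how the other end of the range shifts the numbers.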

Test Parents and Family Members

Of course, because you do share roughly half of the same DNA inherited from your parents, you will have some matches to both you and a sibling that are identical by chance in exactly the same way.  Just finding someone on both of your match lists doesn’t guarantee that the match ISN’T identical by chance.

The best way to eliminate identical by chance matching, of course, is to test your parents.  Sadly, that isn’t always possible.

The next best way to determine legitimate matches is to test other family members.  At Family Tree DNA, they provide customers with the ability to link the DNA tests of family members to their proper location in your tree, and then Family Tree DNA utilizes the common DNA segments to determine common matching between you, that family member(s), and other people.

Those people who match you and a family member on the same segment are then identified as either paternal or maternal matches, based on their position in your tree.

Identifying Lineage

When thinking about who to test, half-siblings, if you have any, are a wonderful way to differentiate between maternal and paternal matches.  Because you and a half sibling share only one parent – which side of your tree those common matches come from is immediately evident!

Of my matches at Family Tree DNA, you can see that of my total 3165 matches, 713 are paternal and 545 are maternal, with 4 being related to both sides.  Don’t get too excited about those “both sides” matches, they are my descendants!

Paternal and maternal bucketing is a great start in terms of identifying which matches are genealogical – and that’s before I do any actual genealogy work.  All I did was test, create or upload a tree and connect tested family members to that tree.

Family Tree DNA is the only vendor to offer this feature.

Ethnicity

Ethnicity is a slippery fish.  I generally only consider ethnicity estimates reliable at the continental level.  There are lots of reasons that siblings will receive somewhat different ethnicity results including the internal algorithms of the various vendors.  You can read about what is involved in ethnicity testing here.

Transfers Give You More For Your Money

If you test at one of the vendors, you may be able to transfer to other vendors as well as GedMatch.  In the chart below, you can see which vendors accept transfers from other vendors. You can read more here.

Have Fun

Lots of people are now testing their DNA and I hope you and your siblings will find some great matches among the new testers. The great thing about siblings, aside from the fact that they are your siblings, is that you can leverage each other’s DNA matches.  Just one more way to share and move the genealogy ball forward.


Concepts – Segment Size, Legitimate and False Matches

Matchmaker, matchmaker, make me a match!

One of the questions I often receive about autosomal DNA is, “What, EXACTLY, is a match?”  The answer at first glance seems evident, meaning when you and someone else are shown on each other’s match lists, but it really isn’t that simple.

What I’d like to discuss today is what actually constitutes a match – and the difference between legitimate or real matches and false matches, also called false positives.

Let’s look at a few definitions before we go any further.

Definitions

  • A Match – when you and another person are found on each other’s match lists at a testing vendor. You may match that person on one or more segments of DNA.
  • Matching Segment – when a particular segment of DNA on a particular chromosome matches to another person. You may have multiple segment matches with someone, if they are closely related, or only one segment match if they are more distantly related.
  • False Match – also known as a false positive match. This occurs when you match someone that is not identical by descent (IBD), but identical by chance (IBC), meaning that your DNA and theirs just happened to match, as a happenstance function of your mother and father’s DNA aligning in such a way that you match the other person, but neither your mother nor your father matches that person on that segment.
  • Legitimate Match – meaning a match that is a result of the DNA that you inherited from one of your parents. This is the opposite of a false positive match.  Legitimate matches are identical by descent (IBD).  Some IBD matches are considered to be identical by population (IBP) because they are a result of a particular DNA segment being present in a significant portion of a given population from which you and your match both descend. Ideally, legitimate matches are not IBP and are instead indicative of a more recent genealogical ancestor that can (potentially) be identified.

You can read about Identical by Descent and Identical by Chance here.

  • Endogamy – an occurrence in which people intermarry repeatedly with others in a closed community, effectively passing the same DNA around and around in descendants without introducing different/new DNA from non-related individuals. People from endogamous communities, such as Jewish and Amish groups, will share more DNA and more small segments of DNA than people who are not from endogamous communities.  Fully endogamous individuals have about three times as many autosomal matches as non-endogamous individuals.
  • False Negative Match – a situation where someone who should match doesn’t. False negatives are very difficult to discern.  We most often see them when a match is hovering at a match threshold and by lowering the threshold slightly, the match is then exposed.  False negative segments can sometimes be detected when comparing DNA of close relatives and can be caused by read errors that break a segment in two, resulting in two segments that are too small to be reported individually as a match.  False negatives can also be caused by population phasing which strips out segments that are deemed to be “too matchy” by Ancestry’s Timber algorithm.
  • Parental or Family Phasing – utilizing the DNA of your parents or other close family members to determine which side of the family a match derives from. Actual phasing means to determine which parts of your DNA come from which parent by comparing your DNA to at least one, if not both parents.  The results of phasing are that we can identify matches to family groups such as the Phased Family Finder results at Family Tree DNA that designate matches as maternal or paternal based on phased results for you and family members, up to third cousins.
  • Population Based Phasing – In another context, phasing can refer to academic phasing where some DNA that is population based is removed from an individual’s results before matching to others. Ancestry does this with their Timber program, effectively segmenting results and sometimes removing valid IBD segments.  This is not the type of phasing that we will be referring to in this article and parental/family phasing should not be confused with population/academic phasing.

IBD and IBC Match Examples

It’s important to understand the definitions of Identical by Descent and Identical by Chance.

I’ve created some easy examples.

Let’s say that a match is defined as any 10 DNA locations in a row that match.  To keep this comparison simple, I’m only showing 10 locations.

In the examples below, you are the first person, on the left, and your DNA strands are showing.  You have a pink strand that you inherited from Mom and a blue strand inherited from Dad.  Mom’s 10 locations are all filled with A and Dad’s locations are all filled with T.  Unfortunately, Mother Nature doesn’t keep your Mom’s and Dad’s strands on one side or the other, so their DNA is mixed together in you.  In other words, you can’t tell which parts of your DNA are whose.  However, for our example, we’re keeping them separate because it’s easier to understand that way.

Legitimate Match – Identical by Descent from Mother

matches-ibd-mom

In the example above, Person B, your match, has all As.  They will match you and your mother, both, meaning the match between you and person B is identical by descent.  This means you match them because you inherited the matching DNA from your mother. The matching DNA is bordered in black.

Legitimate Match – Identical by Descent from Father

In this second example, Person C has all T’s and matches both you and your Dad, meaning the match is identical by descent from your father’s side.

matches-ibd-dad

You can clearly see that you can have two different people match you on the same exact segment location, but not match each other.  Person B and Person C both match you on the same location, but they very clearly do not match each other because Person B carries your mother’s DNA and Person C carries your father’s DNA.  These three people (you, Person B and Person C) do NOT triangulate, because B and C do not match each other.  The article, “Concepts – Match Groups and Triangulation” provides more details on triangulation.

Triangulation is how we prove that individuals descend from a common ancestor.

If Person B and Person C both descended from your mother’s side and matched you, then they would both carry all As in those locations, and they would match you, your mother and each other.  In this case, they would triangulate with you and your mother.

False Positive or Identical by Chance Match

This third example shows that Person D does technically match you, because they have all As and Ts, but they match you by zigzagging back and forth between your Mom’s and Dad’s DNA strands.  Of course, there is no way for you to know this without matching Person D against both of your parents to see if they match either parent.  If your match does not match either parent, the match is a false positive, meaning it is not a legitimate match.  The match is identical by chance (IBC).

matches-ibc
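The three examples so far can be mimicked in a few lines. This is a toy sketch under the simplifications of this article: 10 locations, Mom’s strand all A, Dad’s strand all T, and each other person reduced to a single string of values. The function names and Person D’s exact zigzag pattern are my own inventions for illustration:

```python
MOM_STRAND = "A" * 10  # the strand you inherited from Mom
DAD_STRAND = "T" * 10  # the strand you inherited from Dad

def matches_strand(person, strand):
    """True when the person matches one parental strand at every location."""
    return all(p == s for p, s in zip(person, strand))

def matches_you_unphased(person):
    """Your strands are mixed together, so either of your two values counts."""
    return all(p in (m, d) for p, m, d in zip(person, MOM_STRAND, DAD_STRAND))

def classify(person):
    if not matches_you_unphased(person):
        return "no match"
    if matches_strand(person, MOM_STRAND):
        return "IBD (maternal)"
    if matches_strand(person, DAD_STRAND):
        return "IBD (paternal)"
    return "IBC"  # matches you only by zigzagging between Mom's and Dad's strands

person_b = "A" * 10       # all As
person_c = "T" * 10       # all Ts
person_d = "ATATTAATTA"   # hypothetical zigzag of As and Ts

for name, dna in [("B", person_b), ("C", person_c), ("D", person_d)]:
    print("Person", name, "->", classify(dna))
```

Without the parental strands, `matches_you_unphased` is all a vendor can see, which is why Person D lands on your match list.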

One clue as to whether a match is IBC or IBD, even without your parents, is whether the person matches you and other close relatives on this same segment.  If not, then the match may be IBC. If the match also matches close relatives on this segment, then the match is very likely IBD.  Of course, the segment size matters too, which we’ll discuss momentarily.

If a person triangulates with 2 or more relatives who descend from the same ancestor, then the match is identical by descent, and not identical by chance.

False Negative Match

This last example shows a false negative.  The DNA of Person E had a read error at location 5, meaning that there are not 10 locations in a row that match.  This causes you and Person E to NOT be shown as a match, creating a false negative situation, because you actually would match if Person E hadn’t had the read error.

matches-false-negative
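Here is the same idea as a small sketch, using the toy rule from this article that a match is 10 matching locations in a row. The function name is hypothetical, and the "G" standing in for the misread value is arbitrary:

```python
THRESHOLD = 10  # toy rule from this article: 10 matching locations in a row

def longest_matching_run(yours, theirs):
    """Length of the longest unbroken stretch of matching locations."""
    best = current = 0
    for a, b in zip(yours, theirs):
        if a == b:
            current += 1
            best = max(best, current)
        else:
            current = 0  # a single mismatch breaks the run
    return best

you = "A" * 10
person_e_actual = "A" * 10                 # what Person E really carries
person_e_as_read = "AAAA" + "G" + "AAAAA"  # read error at location 5

for label, dna in [("actual", person_e_actual), ("as read", person_e_as_read)]:
    run = longest_matching_run(you, dna)
    print(label, "- longest run:", run, "- reported as match:", run >= THRESHOLD)
```

The read error leaves two short runs, neither of which reaches the threshold, so the match is never reported.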

Of course, false negatives are by definition very hard to identify, because you can’t see them.

Comparisons to Your Parents

Legitimate matches will phase to your parents – meaning that you will match Person B on the same amount of a specific segment, or a smaller portion of that segment, as one of your parents.

False matches mean that you match the person, but neither of your parents matches that person, meaning that the segment in question is identical by chance, not by descent.

Comparing your matches to both of your parents is the easiest litmus paper test of whether your matches are legitimate or not.  Of course, the caveat is that you must have both of your parents available to fully phase your results.

Many of us don’t have both parents available to test, so let’s take a look at how often false positive matches really do occur.

False Positive Matches

How often do false matches really happen?

The answer to that question depends on the size of the segments you are comparing.

Very small segments, say at 1cM, are very likely to match randomly, because they are so small.  You can read more about SNPs and centiMorgans (cM) here.

As a rule of thumb, the larger the matching segment as measured in cM, with more SNPs in that segment:

  • The stronger the match is considered to be
  • The more likely the match is to be IBD and not IBC
  • The closer in time the common ancestor, facilitating the identification of said ancestor

Just in case we forget sometimes, identifying ancestors IS the purpose of genetic genealogy, although it seems like we sometimes get all geeked out by the science itself and process of matching!  (I can hear you thinking, “speak for yourself, Roberta.”)

It’s Just a Phase!!!

Let’s look at an example of phasing a child’s matches against those of their parents.

In our example, we have a non-endogamous female child (so she inherits an X chromosome from both parents) whose matches are being compared to those of her parents.

I’m utilizing files from Family Tree DNA. Ancestry does not provide segment data, so Ancestry files can’t be used.  At 23andMe, coordinating the security surrounding 3 individuals’ results and trying to make sure that the child and both parents all have access to the same individuals through sharing would be a nightmare, so the only vendor whose results you can reasonably utilize for phasing is Family Tree DNA.

You can download the matches for each person by chromosome segment by selecting the chromosome browser and the “Download All Matches to Excel (CSV Format)” at the top right above chromosome 1.

matches-chromosomr-browser

All segment matches 1cM and above will be downloaded into a CSV file, which I then save as an Excel spreadsheet.

I downloaded the files for both parents and the child. I deleted segments below 3cM.

About 75% of the rows in the files were segments below 3cM. In part, I deleted these segments due to the sheer size and the fact that the segment matching was a manual process.  In part, I did this because I already knew that segments below 3 cM weren’t terribly useful.

Rows              Father   Mother   Child
Total             26,887   20,395   23,681
< 3 cM removed    20,461   15,025   17,784
Total Processed    6,426    5,370    5,897
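For anyone who wants to check the arithmetic behind the table, a short sketch follows. The row counts come straight from the table. The function name is mine:

```python
# (total rows, rows below 3 cM) for each person, taken from the table above.
ROW_COUNTS = {
    "Father": (26887, 20461),
    "Mother": (20395, 15025),
    "Child": (23681, 17784),
}

def summarize(total, removed):
    """Return the rows kept and the percentage of rows that were removed."""
    kept = total - removed
    percent_removed = 100 * removed / total
    return kept, percent_removed

for person, (total, removed) in ROW_COUNTS.items():
    kept, percent_removed = summarize(total, removed)
    print(f"{person}: {kept:,} rows kept, {percent_removed:.1f}% removed")
```

Each person lands in the neighborhood of the 75% figure mentioned above.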

Because I have the ability to phase these matches against both parents, I wanted to see how many of the matches in each category were indeed legitimate matches and how many were false positives, meaning identical by chance.

How does one go about doing that, exactly?

Downloading the Files

Let’s talk about how to make this process easy, at least as easy as possible.

Step one is downloading the chromosome browser matches for all 3 individuals, the child and both parents.

First, I downloaded the child’s chromosome browser match file and opened the spreadsheet.

Second, I downloaded the mother’s file, colored all of her rows pink, then appended the mother’s rows into the child’s spreadsheet.

Third, I did the same with the father’s file, coloring his rows blue.

After I had all three files in one spreadsheet, I sorted the columns by segment size and removed the segments below 3cM.

Next, I sorted the remaining items on the spreadsheet, in order, by column, as follows:

  • End
  • Start
  • Chromosome
  • Matchname

matches-both-parents

My resulting spreadsheet looked like this.  Sorting in the order prescribed provides you with the matches to each person in chromosome and segment order, facilitating easy (OK, relatively easy) visual comparison for matching segments.
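If you would rather script the sort than click through a spreadsheet, here is a minimal sketch. It assumes the four sorts are applied one after another, so the last one (Matchname) ends up as the primary key. Python’s sort is stable, so sorting by each column in turn gives the same result as one sort on all four. The rows are invented for illustration and use only the column names listed above:

```python
# Hypothetical rows; real files have many more columns and rows.
rows = [
    {"Matchname": "Mess",   "Chromosome": 11, "Start": 700, "End": 900},
    {"Matchname": "Donald", "Chromosome": 4,  "Start": 300, "End": 500},
    {"Matchname": "Mess",   "Chromosome": 2,  "Start": 100, "End": 400},
    {"Matchname": "Mess",   "Chromosome": 11, "Start": 100, "End": 300},
    {"Matchname": "Donald", "Chromosome": 4,  "Start": 300, "End": 450},
]

# Successive sorts, least significant column first.
successive = list(rows)
for column in ("End", "Start", "Chromosome", "Matchname"):
    successive.sort(key=lambda row: row[column])

# Equivalent single sort with Matchname as the primary key.
single = sorted(
    rows,
    key=lambda row: (row["Matchname"], row["Chromosome"], row["Start"], row["End"]),
)

for row in successive:
    print(row)
```

Chromosomes are plain integers here. The X chromosome would need to be mapped to a number (23, for instance) before it sorts correctly alongside them.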

I then colored all of the child’s NON-matching segments green so that I could see (and eventually filter the matchname column by) the green color indicating that they were NOT matches.  Do this only for the child, or the white (non-colored) rows.  The child’s matchname only gets colored green if there is no corresponding match to a parent for that same person on that same chromosome segment.

matches-child-some-parents

All of the child’s matches that DON’T have a corresponding parent match in pink or blue for that same person on that same segment will be colored green.  I’ve boxed the matches so you can see that they do match, and that they aren’t colored green.

In the above example, Donald and Gaff don’t match either parent, so they are all green.  Mess does match the father on some segments, so those segments are boxed, but the rest of Mess doesn’t match a parent, so is colored green.  Sarah doesn’t match any parent, so she is entirely green.
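The green-coloring rule can be sketched as follows. This is not how I did it (I did it by eye), and the function names, the "Color" field and all of the coordinates are hypothetical. A child’s row is flagged green when no parent row for the same person on the same chromosome overlaps it:

```python
def overlaps(child_row, parent_row):
    """Same person, same chromosome, and the two segments share some positions."""
    return (
        child_row["Matchname"] == parent_row["Matchname"]
        and child_row["Chromosome"] == parent_row["Chromosome"]
        and child_row["Start"] <= parent_row["End"]
        and parent_row["Start"] <= child_row["End"]
    )

def flag_child_rows(child_rows, parent_rows):
    """Return copies of the child's rows, colored green when no parent row overlaps."""
    flagged = []
    for row in child_rows:
        has_parent_match = any(overlaps(row, parent) for parent in parent_rows)
        flagged.append({**row, "Color": "none" if has_parent_match else "green"})
    return flagged

# Hypothetical coordinates for the people named above.
child_rows = [
    {"Matchname": "Donald", "Chromosome": 3,  "Start": 1_000_000,  "End": 4_000_000},
    {"Matchname": "Mess",   "Chromosome": 11, "Start": 20_000_000, "End": 30_000_000},
    {"Matchname": "Mess",   "Chromosome": 14, "Start": 5_000_000,  "End": 8_000_000},
    {"Matchname": "Sarah",  "Chromosome": 7,  "Start": 2_000_000,  "End": 6_000_000},
]
father_rows = [
    {"Matchname": "Mess", "Chromosome": 11, "Start": 20_000_000, "End": 30_000_000},
]
mother_rows = []

for row in flag_child_rows(child_rows, father_rows + mother_rows):
    print(row["Matchname"], row["Chromosome"], row["Color"])
```

Note that this rule treats any overlap at all as a parent match, so borderline cases still deserve a look by eye.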

Yes, you do manually have to go through every row on this combined spreadsheet.

If you’re going to phase your matches against your parent or parents, you’ll want to know what to expect.  Just because you’ve seen one match does not mean you’ve seen them all.

What is a Match?

So, finally, the answer to the original question, “What is a Match?”  Yes, I know this was the long way around the block.

In the exercise above, we weren’t evaluating matches, we were just determining whether or not the child’s match also matched the parent on the same segment, but sometimes it’s not clear whether they do or do not match.

matches-child-mess

In the case of the second match with Mess on chromosome 11, above, the starting and ending locations, and the number of cM and segments are exactly the same, so it’s easy to determine that Mess matches both the child and the father on chromosome 11. All matches aren’t so straightforward.

Typical Match

matches-typical

This looks like your typical match for one person, in this case, Cecelia.  The child (white rows) matches Cecelia on three segments that don’t also match the child’s mother (pink rows.)  Those non-matching child’s rows are colored green in the match column.  The child matches Cecelia on two segments that also match the mother, on chromosome 20 and the X chromosome.  Those matching segments are boxed in black.

The segments in both of these matches have exact overlaps, meaning they start and end in exactly the same location, but that’s not always the case.

And for the record, matches that begin and/or end in the same location are NOT more likely to be legitimate matches than those that start and end in different locations.  Vendors use small buckets for matching, and if you fall into any part of the bucket, even if your match doesn’t entirely fill the bucket, the bucket is considered occupied.  So what you’re seeing are the “fuzzy” bucket boundaries.

(Over)Hanging Chad

matches-overhanging

In this case, Chad’s match overhangs on each end.  You can see that Chad’s match to the child begins at 52,722,923 before the mother’s match at 53,176,407.

At the end location, the child’s matching segment also extends beyond the mother’s, meaning the child matches Chad on a longer segment than the mother.  This means that the segment sections before 53,176,407 and after 61,495,890 are false positive matches, because Chad does not also match the child’s mother on these portions of the segment.

This segment still counts as a match though, because on the majority of the segment, Chad does match both the child and the mother.

Nested Match

matches-nested

This example shows a nested match, where the parent’s match to Randy begins before the child’s and ends after the child’s, meaning that the child’s matching DNA segment to Randy is entirely nested within the mother’s.  In other words, pieces got shaved off of both ends of this segment when the child was inheriting from her mother.
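The exact, overhanging and nested cases can be told apart with a simple comparison of start and end locations. A sketch follows, with a hypothetical function name and category labels. The mother’s boundaries and the child’s start for Chad come from the example above, but the child’s end location and the nested example’s coordinates are invented:

```python
def relation(child, parent):
    """Classify a child's segment against a parent's segment for the same person.

    Each segment is a (start, end) pair on the same chromosome.
    """
    c_start, c_end = child
    p_start, p_end = parent
    if c_end < p_start or p_end < c_start:
        return "no overlap"
    if (c_start, c_end) == (p_start, p_end):
        return "exact"
    if p_start <= c_start and c_end <= p_end:
        return "nested"      # child's segment sits entirely inside the parent's
    if c_start <= p_start and p_end <= c_end:
        return "overhang"    # child's segment extends past the parent's
    return "partial overlap"

mother_chad = (53_176_407, 61_495_890)
child_chad = (52_722_923, 61_900_000)   # end location is hypothetical

print(relation(child_chad, mother_chad))
print(relation((55_000_000, 60_000_000), mother_chad))  # hypothetical nested segment
```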

No Common Matches

matches-no-common

Sometimes, the child and the parent will both match the same person, but there are no common segments.  Don’t read more into this than what it is.  The child’s matches to Mary are false matches.  We have no way to judge the mother’s matches, except for segment size probability, which we’ll discuss shortly.

Look Ma, No Parents

matches-no-parents

In this case, the child matches Don on 5 segments, including a reasonably large segment on chromosome 9, but there are no matches between Don and either parent.  I went back and looked at this to be sure I hadn’t missed something.

This could, possibly, be an instance of an unseen false negative, meaning perhaps there is a read issue in the parent’s file on chromosome 9, precluding a match.  However, in this case, since Family Tree DNA does report matches down to 1cM, it would have to be an awfully large read error for that to occur.  Family Tree DNA does have quality control standards in place and each file must pass the quality threshold to be put into the matching database.  So, in this case, I doubt that the problem is a false negative.

Just because there are multiple IBC matches to Don doesn’t mean any of those are incorrect.  It’s just the way that the DNA is inherited and it’s why this type of a match is called identical by chance – the key word being chance.

Split Match

matches-split

This split match is very interesting.  If you look closely, you’ll notice that Diane matches Mom on the entire segment on chromosome 12, but the child’s match is broken into two.  However, the number of SNPs adds up to the same, and the number of cM is close.  This suggests that there is a read error in the child’s file forcing the child’s match to Diane into two pieces.

If the segments broken apart were smaller, under the match threshold, and there were no other higher matches on other segments, this match would not be shown and would fall into the False Negative category.  However, since that’s not the case, it’s a legitimate match and just falls into the “interesting” category.

The Deceptive Match

matches-surname

Don’t be fooled by seeing a family name in the match column and deciding it’s a legitimate match.  Harrold is a family surname and Mr. Harrold does not match either of the child’s parents, on any segment.  So not a legitimate match, no matter how much you want it to be!

Suspicious Match – Probably not Real

matches-suspicious

This technically is a match, because part of the DNA that Daryl matches between Mom and the child does overlap, from 111,236,840 to 113,275,838.  However, if you look at the entire match, you’ll notice that not a lot of that segment overlaps, and the number of cMs is already low in the child’s match.  There is no way to calculate the number of cMs and SNPs in the overlapping part of the segment, but suffice it to say that it’s smaller, and probably substantially smaller, than the 3.32 cM total match for the child.

It’s up to you whether you actually count this as a match or not.  I just hope this isn’t one of those matches you REALLY need.  However, in this case, the Mom’s match at 15.46 cM is 99% likely to be a legitimate match, so you really don’t need the child’s match at all!!!
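While the cM and SNP counts of the overlap can’t be calculated from the file, the overlap can at least be measured in base pairs as a rough guide. In this sketch only the overlap boundaries (111,236,840 and 113,275,838) come from the Daryl example. The child’s start, Mom’s end and the function name are hypothetical, and base pairs do not convert to cM at a fixed rate:

```python
def overlap_bp(segment_a, segment_b):
    """Number of base pairs two (start, end) segments have in common, or 0 if none."""
    start = max(segment_a[0], segment_b[0])
    end = min(segment_a[1], segment_b[1])
    return max(0, end - start)

child_daryl = (105_000_000, 113_275_838)  # start is hypothetical
mom_daryl = (111_236_840, 125_000_000)    # end is hypothetical

shared = overlap_bp(child_daryl, mom_daryl)
child_length = child_daryl[1] - child_daryl[0]

print("Overlap in base pairs:", shared)
print("Share of the child's segment:", round(shared / child_length, 2))
```

A low share on an already small segment is what makes a match like this one suspicious.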

So, Judge Judy, What’s the Verdict?

How did our parental phasing turn out?  What did we learn?  How many segments matched both the child and a parent, and how many were false matches?

In each cM Size category below, I’ve included the total number of child’s match rows found in that category, the number of parent/child matches, the percent of parent/child matches, the number of matches to the child that did NOT match the parent, and the percent of non-matches. A non-match means a false match.

So, what’s the verdict?

matches-parent-child-phased-segment-match-chart

It’s interesting to note that we just approach the 50% mark for phased matches in the 7-7.99 cM bracket.

The bracket just beneath that, 6-6.99 shows only a 30% parent/child match rate, as does 5-5.99.  At 3 cM and 4 cM few matches phase to the parents, but some do, and could potentially be useful in groups of people descended from a known common ancestor and in conjunction with larger matches on other segments. Certainly segments at 3 cM and 4 cM alone aren’t very reliable or useful, but that doesn’t mean they couldn’t potentially be used in other contexts, nor are they always wrong. The smaller the segment, the less confidence we can have based on that segment alone, at least below 9-15cM.

Above the 50% match level, we quickly reach the 90th percentile in the 9-9.99 cM bracket, and above 10 cM, we’re virtually assured of a phased match, but not quite 100% of the time.

It isn’t until we reach the 16cM category that we actually reach the 100% bracket, and there is still an outlier found in the 18-18.99 cM group.

I went back and checked all of the 10 cM and over non-matches to verify that I had not made an error.  If I made errors, they were likely counting too many as NON-matches, and not the reverse, meaning I failed to visually identify matches.  However, with almost 6000 spreadsheet rows for the child, a few errors wouldn’t affect the totals significantly or even noticeably.

I hope that other people in non-endogamous populations will do the same type of double parent phasing and report on their results in the same type of format.  This experiment took about 2 days.

Furthermore, I would love to see this same type of experiment for endogamous families as well.

Summary

If you can phase your matches to either or both of your parents, absolutely, do.  This exercise shows why, if you have only one parent to match against, you can’t just assume that anyone who doesn’t match you on your one parent’s side automatically matches you from the other parent. At least, not below about 15 cM.

Whether you can phase against your parent or not, this exercise should help you analyze your segment matches with an eye towards determining whether or not they are valid, and what different kinds of matches mean to your genealogy.

If nothing else, at least we can quantify the relative likelihood, based on the size of the matching segment, that a match in a non-endogamous population would also match a parent, if we had one to match against, meaning that it is a legitimate match.  Did you get all that?

In a nutshell, we can look at the Parent/Child Phased Match Chart produced by this exercise and say that our 8.5 cM match has about a 66% chance of being a legitimate match, and our 10.5 cM match has a 95% chance of being a legitimate match.

You’re welcome.

Enjoy!!


Nine Autosomal Tools at Family Tree DNA

The introduction of the Phased Family Finder Matches has added a new way to view autosomal DNA results at Family Tree DNA and a powerful new tool to the genealogist’s toolbox.

The Phased Family Finder Matches are the 9th tool provided for autosomal test results by Family Tree DNA. Did you know there were 9?

Each of the different methodologies provides us with information in a unique way to assist in our relentless search for cousins, ancestors and our quests to break down brick walls.

That’s the good news.

The not-so-good news is that sometimes options are confusing, so I’d like to review each tool for viewing autosomal match information, including:

  • When to use each tool
  • How to use each tool
  • What the results mean to you
  • The unique benefits of each tool
  • The cautions and things you need to know about each tool including what they are not

The tools are:

  1. Regular Matching
  2. ICW (In Common With)
  3. Not ICW (Not In Common With)
  4. The Matrix
  5. Chromosome Browser
  6. Phased Family Matching
  7. Combined Advanced Matching
  8. MyOrigins Matching
  9. Spreadsheet Matching

You Have Options

Family Tree DNA provides their clients with options, for which I am eternally grateful. I don’t want any company deciding for me which matches are and are not important based on population phasing (as opposed to parental phasing), and then removing matches they feel are unimportant. For people who are not fully endogamous, but have endogamous lines, matches to those lines, which are valid matches, tend to get stripped away when a company employs population based phasing – and once those matches are gone, there is no recovery unless your match happens to transfer their results to either Family Tree DNA or GedMatch.

The great news is that the latest new option, Phased Family Matching, is focused on making easy visual comparisons of high quality parental matches which is especially useful for those who don’t want to dig deeply.

There are good options for everyone at all ranges of expertise, from beginners to those who like to work with spreadsheets and extract every teensy bit of information.

So let’s take a look at all of your matching options at Family Tree DNA. If you’re not taking advantage of all of them, you’re missing out. Each option is unique and offers something the other options don’t offer.

In case you’re curious, I’ll be bouncing back and forth between my kit, my mother’s kit and another family member’s kit because, based on their matches utilizing the various tools, different kits illustrate different points better.

Also, please note that you can click on any image to see a larger version.

Selecting Options

FF9 options

Your selection options for Family Finder are available on both your Dashboard page under the Family Finder heading, right in the middle of the page, and the dropdown myFTDNA menu, on the upper left, also under Family Finder.

Ok, let’s get started. 

#1 – Regular Matching

By regular matching, I’m referring to the matches you see when you click on the “Matches” tab on your main screen under Family Finder or in the dropdown box.

FF9 regular matching

Everyone uses this tool, but not everyone knows about the finer points of various options provided.

There’s a lot of information here folks. Are you systematically using this information to its full advantage?

Your matches are displayed in the highest match first order. All of the information we utilize regularly (or should) is present, including:

  • Relationship Range
  • Match Date
  • Shared CentiMorgans
  • Longest (shared) Block
  • X-Match
  • Known Relationship
  • Ancestral Surnames (double click to see entire list)
  • Notes
  • E-mail envelope icon
  • Family Tree
  • Parental “side” icon

The Expansion “+” at the right side of each match, shown below, shows us:

  • Tests Taken
  • mtDNA haplogroup
  • Y haplogroup

Clicking on your match’s profile (their picture) provides additional information, if they have provided that information:

  • Most distant maternal ancestor
  • Most distant paternal ancestor
  • Additional information in the “about me” field, sometimes including a website link

On the match page, you can search for matches either by their full name, first name, last name or click on the “Advanced Search” to search for ancestral surname. These search boxes can be found at the top right.

FF9 advanced search

The Advanced Search feature, underneath the search boxes at right, also provides you with the option of combining search criteria, by opening two drop down boxes at the top left of the screen.

FF9 search combo

Let’s say I want to see all of my matches on the X chromosome. I make that selection and the only people displayed as matches are those whom I match on the X chromosome.

You can see that in this case, there are 280 matches. If I have any Phased Family Matches, then you will see how many X matches I have on those tabs too.

The first selection box works in combination with the second selection box.

FF9 search combo 2

Now, let’s say I want to sort in Longest Block Order. That selection sorts and displays the people who match me on the X chromosome in Longest Block Order.

FF9 longest block

Prerequisites

  • Take the Family Finder test or transfer your results from either 23andMe (V3 only) or Ancestry (V1 only, currently.)
  • Match must be over the matching threshold of 9cM if shared cM are less than 20, or, the longest block must be at least 7.69 cM if the total shared cM is 20 or greater.
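The threshold rule in the second bullet is easier to follow as a small function. This is a sketch of one reading of that rule, namely that the 9 cM and 7.69 cM figures both apply to the longest block and that the total shared cM decides which one is used. The function name is hypothetical and this is not Family Tree DNA’s actual code.

```python
def is_listed_match(total_shared_cm: float, longest_block_cm: float) -> bool:
    """Apply the two-tier match threshold as described in the text.

    Assumption: both figures are tested against the longest block.
    """
    if total_shared_cm >= 20:
        # 20 cM or more shared in total: a longest block of 7.69 cM is enough.
        return longest_block_cm >= 7.69
    # Less than 20 cM shared in total: the longest block must reach 9 cM.
    return longest_block_cm >= 9.0


examples = {
    "25 cM total, 8.0 cM block": is_listed_match(25, 8.0),
    "15 cM total, 8.0 cM block": is_listed_match(15, 8.0),
    "15 cM total, 9.5 cM block": is_listed_match(15, 9.5),
}
```

Under this reading, the same 8.0 cM block appears on a match list when the total shared DNA is 25 cM, but not when the total is only 15 cM.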

Power Features

  • The ability to customize your view by combining search, match and sort criteria.

Cautions

  • It’s easy to forget that you’re ONLY working with X matches, for example, once you sort, and not all of your matches. Note the Reset Filter button above your matches which clears all of the sort and search criteria. Always reset, just to be on the safe side, before you initiate another sort.

FF9 reset filter

  • Please note that the search boxes and logic are in the process of being redesigned, per a conversation with Michael Davila, Director of Product Development, on 7-20-2016. Currently, if you search for the name “Donald,” for example, and then do an “in common with” match to someone on the Donald match list, you’ll only see those individuals who are in common with “Donald,” meaning anyone without “Donald” as one of their names won’t show as a match. The logic will be revised shortly so that you will see everyone “in common with,” not just “Donald.” Just be aware of this today and don’t do an ICW with someone you’ve searched for in the search box until this is revised.

#2 – In Common With (ICW)

You can select anyone from your match list to see who you match in common with them.

This is an important feature because it gives me a very good clue as to who else may match me on that same genealogical line.

For example, cousin Donald is related on the paternal line. I can select Donald by clicking the box to the left of his profile which highlights his row in yellow. I can then select what I want to do with Don’s match.

FF9 ICW

You will see that Don is selected in the match selection box on the lower left, and the options for what I can do with Don are above the matches. Those options are:

  • Chromosome Browser
  • In Common With
  • Not in Common With

Let’s select “In Common With.”

Now, the matches displayed will ONLY be those that I match in common with Don, meaning that Donald and I both match these people.

FF9 ICW matches

As you can see, I’m displaying my matches in common with Don in longest block order. You can click on any of the header columns to display in reverse order.

There are a total of 82 matches in common with Don and of those, 50 are paternally assigned. We’ll talk about how parental “side” assignments happen in a minute.

Prerequisites

  • None

Power Features

  • Can see at a glance which matches warrant further inspection and may (or may not) be from a common genealogical line.

Cautions

  • An ICW match does NOT mean that the matching individual IS from the same common line – only genealogical research can provide that information.
  • An ICW match does NOT mean that these three people, you, your match and someone who matches both of you, are triangulated – meaning matching on the same segment. Only individual matching with each other provides that information.
  • It’s easy to forget that you’re not working with your entire match list, but a subset. You can see that Donald’s name appears in the box at the upper left, along with the function you performed (ICW) and the display order if you’ve selected any options from the second box.
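Conceptually, an ICW list is just the intersection of two match lists. The Python sketch below illustrates that idea with made-up lists. The “Person” names are placeholders and the function name is hypothetical.

```python
# Hypothetical match lists; the "Person" names are placeholders.
my_matches = {"Donald", "Cheryl", "Rex", "Person A", "Person B"}
donald_matches = {"Me", "Cheryl", "Rex", "Person C"}


def in_common_with(mine, theirs):
    """People who appear on both match lists, in alphabetical order."""
    return sorted(mine & theirs)


icw = in_common_with(my_matches, donald_matches)
```

Notice that nothing in this comparison looks at segments. It only says who is on both lists, which is exactly why an ICW match is not triangulation.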

#3 – Not In Common With

Now, let’s say I want to see all of my X matches that are not in common with my mother, who is in the database, which of course suggests that they are either on my father’s side or identical by chance. My father is not in the database, and given that he died in 1963, there is no chance of testing him.

Keep in mind, though, that because X matches aren’t displayed unless you have another qualifying autosomal segment, they are more likely to be valid matches than if they were displayed without another matching segment that qualifies as a match.

For those who don’t know, X matches have a unique inheritance pattern which can yield great clues as to which side of your tree (if you’re a male), and which ancestors on various sides of your tree X matches MUST come from (males and females both.) I wrote about this here, along with some tools to help you work with X matches.

To utilize the “Not In Common With” feature, I would select my mother and then select the “Not In Common With” option, above the matches.

FF9 NICW

I would then sort the results to see the X matches by clicking on the top of the column for X-Match – or by any other column that I wanted to see.

FF9 NICW X

I have one very interesting not in common with match – and that’s with a Miller male that I would have assumed, based on the surname, was a match from my mother’s side. He’s obviously not, at least based on that X match. No assuming allowed!

Prerequisites

  • None

Power Features

  • Can see at a glance which matches warrant further inspection and may be from a common genealogical line – or are NOT in common with a particular person.

Cautions

  • Be sure to understand that “not in common with” means that you, the person you match and the list of people shown as a result of the “Not ICW” do not all match each other.  You DO match the person on your match list, but the list of “not in common with” matches are the people who DON’T match both of you.  Not in common with is the opposite of “in common with” where your match list does match you and the person you’re matching in common with.
  • The X and other chromosome matches may be inherited from different ancestors. Every matching segment needs to be analyzed separately.
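The “Not In Common With” step followed by the X-Match sort amounts to removing one person’s matches from your list and then keeping the X matches. A minimal Python sketch with made-up data follows. The names and the `Match` structure are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    name: str
    x_match: bool


# Hypothetical data: my match list (excluding my mother herself) and the
# names that also appear on my mother's match list.
my_matches = [
    Match("Person A", True),
    Match("Person B", False),
    Match("Person C", True),
    Match("Person D", True),
]
mother_match_names = {"Person A", "Person B"}


def not_in_common_with(mine, other_names):
    """My matches who do NOT also appear on the selected person's list."""
    return [m for m in mine if m.name not in other_names]


nicw = not_in_common_with(my_matches, mother_match_names)
nicw_x = [m.name for m in nicw if m.x_match]  # keep only the X matches
```

The people left in `nicw_x` are candidates for the other parent’s side or identical by chance. The list alone can’t say which.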

#4 – The Matrix

Let’s say that I have a list of matches, perhaps a list of individuals that I found doing an ICW with my cousin, and I wonder if these people match each other. I can utilize the Matrix grid to see.

Going back to the ICW list with cousin Donald, let’s see if some of those people match each other on the Matrix.

Let’s pick 5 people.

I’m selecting Cheryl, Rex, Charles, Doug and Harold.

Margaret Lentz chart

I’m making these particular selections because I know that all of these people, except Harold, are related to my mother, Barbara, shown on the bottom row of the chart above.  This chart, borrowed from another article (William is not in this comparison), shows how Cheryl, Rex, Charles and Barbara who have all DNA tested are related to each other.  Some are related through the Miller line, some through the dual Lentz/Miller line, and some just from the Lentz line.  Doug is related through the Miller line only, and at least 4 generations upstream. Doug may also be related through multiple lines, but is not descended from the Lentz line.

The people I’ve selected for the matrix are not all related to each other, and they don’t all share one common ancestral line.

Harold is a wild card – I have no idea how he is related or who he is related to, so let’s see what we can determine.

FF9 Matrix choices

As you make selections on the Matrix page, up to 10 selections are added to the grid.

FF9 Matrix grid

You can see that Charles matches Cheryl and Harold.

You can see that Rex matches Charles and Cheryl and Harold.

You can see that Doug matches only Cheryl, but this isn’t surprising as the common line between Doug and the known cousins is at least 4 generations further back in time on the Miller line.

The known relationships are:

  • Don and Cheryl are siblings, descended from the Lentz/Miller line.
  • Rex is a known cousin on the Miller/Lentz line
  • Charles is a known cousin on the Lentz line only
  • Doug is a known cousin on the Miller line only

Let me tell you what these matches indicate to me.

Given that Harold matches Rex and Charles and Cheryl, IF and that’s a very big IF, he descends from the same lines, then he would be related to both sides of this family, meaning both the Miller and Lentz lines.

  • He could be a downstream cousin after the Lentz and Miller lines married, meaning a descendant of Margaret Lentz and John David Miller, or other Miller/Lentz couples
  • He could be independently related to both lines upstream. They did intermarry.
  • He could be related to Charles or Rex through an entirely separate line that has nothing to do with Lentz or Miller.

So I have no exact answer, but this does tell me where to look. Maybe I could find additional known Lentz or Miller line descendants to add to the Matrix which would provide additional information.
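For those who like to see the logic, the Matrix is a symmetric yes/no grid built from pairwise matches. The Python sketch below rebuilds the grid from the pairs described above. The function name is hypothetical, and like the Matrix itself, it knows nothing about which segments are involved.

```python
people = ["Cheryl", "Rex", "Charles", "Doug", "Harold"]

# Pairwise matches as described in the text above. A match is symmetric,
# so each pair is stored once as an unordered frozenset.
pairs = {
    frozenset(p)
    for p in [
        ("Charles", "Cheryl"),
        ("Charles", "Harold"),
        ("Rex", "Charles"),
        ("Rex", "Cheryl"),
        ("Rex", "Harold"),
        ("Doug", "Cheryl"),
        ("Harold", "Cheryl"),
    ]
}


def build_matrix(people, pairs):
    """Return {person: {other: True/False}} for every pair of people."""
    return {
        a: {b: frozenset((a, b)) in pairs for b in people if b != a}
        for a in people
    }


matrix = build_matrix(people, pairs)
harold_matches = sorted(name for name, hit in matrix["Harold"].items() if hit)
doug_matches = sorted(name for name, hit in matrix["Doug"].items() if hit)
```

Reading Harold’s row gives Charles, Cheryl and Rex, which is what points toward both the Lentz and Miller lines as places to look.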

Prerequisites

  • None

Power Features

  • Can see at a glance which matches match each other as well.

Cautions

  • Matrix matches do NOT mean that these individuals match on the same segments, it just means they do match on some segment. A matrix match is not triangulation.
  • Matrix matches can easily be from different lines to different ancestors. For example, Harold could match each one of three individuals that he matches on different ancestral lines that have nothing to do with their common Lentz or Miller line.

#5 – Chromosome Browser

I want to know if the 5 individuals that I selected to compare in the Matrix match me on any of the same segments.

I’m going back to my ICW list with cousin Donald.

I’ve selected my 5 individuals by clicking the box to the left of their profiles, and I’m going to select the chromosome browser.

FF9 chromosome browser choices

The chromosome browser shows you where these individuals match you.

Overlapping segments mean the people who overlap all match you on that segment, but overlapping segments do NOT mean they also match each other on these same segments.

Translated, this means they could be matching you on different sides of your family or are identical by chance. Remember, you have two sides to your chromosome, a Mom’s side and a Dad’s side, which are intermingled, and some people will match you by chance. You can read more about this here.

The chromosome browser shows you THAT they match you – it doesn’t tell you HOW they match you or if they match each other.

FF9 chromosome browser view2

The default view shows matches of 5cM or greater. You can select different thresholds at the top of the comparison list.

You’ll notice that all 5 of these people match me, but that only two of them match me on overlapping segments, on chromosome 3. Among those 5 people, only those who match me on the same segments have the opportunity to triangulate.

This gives you the opportunity to ask those two individuals if they also match each other on this same chromosome. In this case, I have access to both of those kits, and I can tell you that they do match each other on those segments, so they do triangulate mathematically. Since I know the common ancestor between myself, Cheryl and Rex, I can assign this segment to John David Miller and Margaret Lentz. That, of course, is the goal of autosomal matching – to identify the common ancestor of the individuals who match.

You also have the option to download the results of this chromosome browser match into a spreadsheet. That’s the left-most download option at the top of the chromosomes. We’ll talk about how to utilize spreadsheets last.

The middle option, “view in a table” shows you these results, one pair of individuals at a time, in a table.

This is me compared to Rex. You will have a separate table for each one of the individuals as compared to you. You switch between them at the bottom right.

FF9 chromosome browser table2

The last download option at the furthest right is for your entire list of matches and where they match you on your chromosomes.

Prerequisites

  • None

Power Features

  • Can visually see where individuals and multiple people match you on your chromosomes, and where they overlap which suggests they may triangulate.

Cautions

  • When two people match you on the same chromosome segment, this does not mean that they also match each other on that segment. Matching on overlapping segments is not triangulation, although it’s the first step to triangulation.
  • For triangulation, you will need to contact your matches to determine if they also match each other on the same segment where they both match you. You may also be able to deduce some family matching based on other known individuals from the same line that you also match on that same segment, if your match matches them on that segment too.
  • The chromosome browser is limited to 5 people at a time, compared to you. By utilizing spreadsheet matching, you can see all of your matches on a particular segment, together.
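The difference between overlapping and triangulating can be sketched in Python. Step one finds matches whose segments overlap on the same chromosome. Step two keeps only pairs that are separately confirmed to match each other on that segment. All positions below are hypothetical. Only the fact that Cheryl and Rex overlap on chromosome 3 comes from the example above, and the function names are my own.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class SegmentMatch:
    name: str
    chromosome: str
    start: int
    end: int


def overlapping_pairs(segments):
    """Pairs of different matches whose segments overlap on one chromosome."""
    found = []
    for a, b in combinations(segments, 2):
        if a.name == b.name or a.chromosome != b.chromosome:
            continue
        if max(a.start, b.start) < min(a.end, b.end):
            found.append((a.name, b.name))
    return found


def triangulation_candidates(segments, confirmed_pairs):
    """Keep overlapping pairs that are also confirmed to match each other.

    confirmed_pairs must come from comparing the two matches directly on the
    same segment; the overlap alone does not prove it.
    """
    return [p for p in overlapping_pairs(segments) if frozenset(p) in confirmed_pairs]


# Hypothetical positions for illustration only.
my_segments = [
    SegmentMatch("Cheryl", "3", 10_000_000, 40_000_000),
    SegmentMatch("Rex", "3", 25_000_000, 60_000_000),
    SegmentMatch("Charles", "7", 5_000_000, 20_000_000),
    SegmentMatch("Doug", "12", 1_000_000, 9_000_000),
    SegmentMatch("Harold", "3", 90_000_000, 99_000_000),
]
confirmed = {frozenset(("Cheryl", "Rex"))}

overlaps = overlapping_pairs(my_segments)
triangulated = triangulation_candidates(my_segments, confirmed)
```

If the `confirmed` set were empty, Cheryl and Rex would still overlap, but nothing would triangulate.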

#6 – Phased Family Matching

Phased Family Matching is the newest tool introduced by Family Tree DNA. I wrote about it here. The icons assigned to matches make it easy to see at a glance which side of your family, maternal or paternal, or both, a match derives from.

ff9 parental icon

Phased Family Matching allows you to link the DNA results of qualified relatives to your tree and by doing so, Family Tree DNA assigns matches to maternal or paternal buckets, or sometimes, both, as shown in the icon above.

This phased matching utilizes parental phasing in addition to a slightly higher threshold to assure that matches can be assigned to parental sides with confidence. In order to be assigned a maternal or paternal icon, your match must match you and your qualifying relative at 9cM or greater on at least one of the same segments over the matching threshold. This is different than an ICW match, which only tells you that you do match, not how you match or that it’s on the same segment.

Qualifying relatives, at this time, are parents, grandparents, uncles, aunts and first cousins. Additional relatives are planned in the near future.

Icons are ONLY placed based on phased match results that meet the criteria.

These icons are important because they indicate which side of your family a match is from with a great deal of precision and confidence – beyond that of regular matching.

This is best illustrated by an example.

Phased FF2

In this example, this individual has their father and mother both in the system. You can see that their father’s side is assigned a blue icon and their mother’s side is assigned a pink (red) icon. This means they match this person on only one side of their family.  A purple icon with both a male and female image means that this person is related to you on both sides of your family.  Full siblings, when both parents are in the system to phase against, would receive both icons.

This sibling is showing as matching them on both sides of their family, because both parents are available for phasing.

If only one parent was available, the father, for example, then the sibling would only show the paternal icon. The maternal icon is NOT added by inference. In Phased Family Matching, nothing is added by inference – only by exact allele by allele matching on the same segment – which is the definition of parentally phased matching.

These icons are ONLY added as a result of high quality phased matches at or above the phased match threshold of 9cM.

You can read more about the Family Matching System in the Family Tree DNA Learning Center, here.

Prerequisites

  • You must have tested (or transferred a kit) for a qualifying relative. At this time qualifying relatives are parents, grandparents, aunts, uncles and first cousins.
  • You must have uploaded a GEDCOM file or created a tree.
  • You must link the DNA of qualifying kits to that person in your tree. I provided instructions for how to do this in this article.
  • You must match at the normal matching threshold to be on the match list, AND then match at or above the Phased Family Match threshold in the way described to be assigned an icon.
  • You must match on at least one full segment at or above 9cM.
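To make the icon rule concrete, here is a rough Python sketch. It is only one reading of the rule described above: a match gets an icon when a segment of 9 cM or more shared with you overlaps a segment of 9 cM or more shared with your linked relative. Family Tree DNA’s actual implementation compares alleles and isn’t documented here, so treat the function, the data structure and the sample positions as hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

PHASED_THRESHOLD_CM = 9.0  # phased match threshold stated in the text


@dataclass(frozen=True)
class Seg:
    chromosome: str
    start: int
    end: int
    cm: float


def overlaps(a: Seg, b: Seg) -> bool:
    return a.chromosome == b.chromosome and max(a.start, b.start) < min(a.end, b.end)


def assign_side(
    match_vs_me: List[Seg], match_vs_relative: List[Seg], relative_side: str
) -> Optional[str]:
    """Return the relative's side if the simplified rule is met, else None.

    Nothing is inferred: failing the test never assigns the other side.
    """
    for mine in match_vs_me:
        if mine.cm < PHASED_THRESHOLD_CM:
            continue
        for theirs in match_vs_relative:
            if theirs.cm >= PHASED_THRESHOLD_CM and overlaps(mine, theirs):
                return relative_side
    return None


# Hypothetical segments for a new match compared to me and to a paternal cousin.
new_match_vs_me = [Seg("3", 32_000_000, 44_000_000, 10.1)]
new_match_vs_cousin = [Seg("3", 30_000_000, 43_000_000, 9.4)]
small_match_vs_me = [Seg("3", 32_000_000, 36_000_000, 6.0)]

icon = assign_side(new_match_vs_me, new_match_vs_cousin, "paternal")
no_icon = assign_side(small_match_vs_me, new_match_vs_cousin, "paternal")
```

The second call returns `None` rather than “maternal,” which mirrors the point that a missing icon says nothing about the other side.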

Power Features

  • Can visually see which side of your family an individual is related to. You can be confident this match is by descent because they are phased to your parent or qualifying family member.

Cautions

  • If someone does not have an icon assigned, it does NOT mean they are not related on that particular side of the family. It only means that the match is not strong enough to generate an icon.
  • If someone DOES match on a particular side of the family, you will still need to do additional matching and genealogy work to determine which ancestor they descend from.
  • If someone is assigned to one side of your family, it does NOT preclude the possibility that they have a smaller or weaker match to your other side of the family.
  • If you upload a new GEDCOM file after linking DNA to people in your tree, you will overwrite your DNA links and will have to relink individuals.
  • Having an icon assigned indicates mathematical triangulation for the person who tested, their parents or close relative against whom they were phased and their match with the icon.  However, technically, it’s not triangulation in cases where very close relatives are involved.  For example, parents, aunts, uncles and siblings are too closely related to be considered the third leg of the triangulation stool.  First cousins, however, in my opinion, could be considered the third leg of the three needed for triangulation.  Of course when triangulation is involved, more than three is always better – the more the merrier and the more certain you can be that you have identified the correct ancestor, ancestral couple, or ancestral line to assign that particular triangulated segment to.

#7 – Combined Advanced Matching

One of the comparison tools often missed by people is Combined Advanced Matching.

Combined matching is available through the “Tools and Apps” button, then select “Advanced Matching.”

Advanced Matching allows you to select various options in combination with each other.

For example, one of my favorites is to compare people within a project.

You can do this a number of ways.

In the case of my mother, I’ll select everyone she matches on the Family Finder test in the Miller-Brethren project. This is a very focused project with the goal of sorting the Miller families who were of the Brethren faith.

FF9 combined matching

You can see that she has several matches in that project.

You can select a variety of combinations, including any level of Y or mtDNA testing, Family Finder, X matching, projects and “last name begins with.”

One of the ways I utilize this feature often is within a surname project, for males in particular. I select one Y level of matching at a time, combined with Family Finder, “show only people I match on all tests” and then the project name. This is a quick way to determine whether someone matches someone on Family Finder that is also in a particular surname project. And when your surname is Smith, this tool is extremely valuable. This provides at least a hint as to the possible distance to a common ancestor between individuals.

Another favorite way to utilize this feature is for non-surname projects like the American Indian project. This is perfect for people who are hunting for others with Native roots that they match – and you can see their Y and mtDNA haplogroups as a bonus!
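The way the Advanced Matching criteria combine can be pictured as a chain of filters, where a person must pass every one you’ve selected. The Python sketch below uses made-up people and test labels. Only the Miller-Brethren project name comes from the example above, and the function is hypothetical, not the site’s real logic.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Person:
    name: str
    tests_matched: Set[str]  # tests on which this person matches you
    projects: Set[str] = field(default_factory=set)


def advanced_match(people, required_tests, project: Optional[str] = None, surname_prefix=""):
    """Names of people who match on ALL required tests and pass the other filters."""
    results = []
    for p in people:
        if not required_tests <= p.tests_matched:
            continue  # "show only people I match on all tests"
        if project is not None and project not in p.projects:
            continue
        surname = p.name.split()[-1].lower()
        if not surname.startswith(surname_prefix.lower()):
            continue  # "last name begins with"
        results.append(p.name)
    return results


# Hypothetical people and test labels for illustration.
people = [
    Person("Alex Miller", {"Family Finder", "Y-DNA37"}, {"Miller-Brethren"}),
    Person("Blake Miller", {"Family Finder"}, {"Miller-Brethren"}),
    Person("Casey Smith", {"Family Finder", "Y-DNA37"}),
]

ff_in_project = advanced_match(people, {"Family Finder"}, project="Miller-Brethren")
both_tests = advanced_match(people, {"Family Finder", "Y-DNA37"}, project="Miller-Brethren")
```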

Prerequisites

  • Must have joined the particular project if you want to use the project match feature within that project.

Power Features

  • The ability to combine matching criteria across products.
  • The ability to match within projects.
  • The ability to specify partial surnames.

Cautions

  • If you match someone on both Family Finder and either Y or mtDNA haplogroups, this does NOT mean that your common Family Finder ancestor is on that haplogroup line. It might be a good place to begin looking. Check to see if you match on the Y or mtDNA products as well.
  • All matches have their haplogroup displayed, not just IF you also match that haplogroup, unless you’ve specified the Y or mtDNA options and then you would only see the people you match which would be in the same major haplogroup, although not always the same subgroup because not everyone tests at the same level.
  • Not all surname project administrators allow people who do not carry that surname in the present generation to join their projects.

#8 – MyOrigins Matching

One tool missed by many is the MyOrigins matching by ethnicity. For many, especially if you are entirely European, for example, this tool isn’t terribly useful, but if you are of mixed heritage, this tool can be a wonderful source of information.

Your matches (who have authorized this type of matching) will be displayed, showing only if they match you on your major world categories.  Only your matching categories will show.  For example, if my match, Frances, also has African heritage and I do not, I won’t see Frances’s African percentage and vice versa.

FF9 myOrigins

In this example, the person who tested falls into the major categories of European and Middle Eastern. Their matches who fall into either of these same categories will be displayed in the Shared Origins box. You may not be terribly excited about this – unless you are mixed African, Asian, European and Native American – and you have “lost ancestors” you can’t find. In that case, you may be very excited to contact other matches with the same ethnic heritage.

When you first open your myOrigins page, you will be greeted with a choice to opt in (by clicking) or to opt out (by doing nothing) of allowing your ethnic matches to view the same ethnic groups you carry. Your matches will not be able to see your ethnic groups that they don’t have in common with you.

FF9 myorigins opt in

You can also access those options to view or change by clicking on Account Settings, Privacy and Sharing, and then you can view or change your selection under “My DNA Results.”

FF9 myorigins security

Prerequisites

  • Must authorize Shared Origins matching.

Power Features

  • The ability to discern who among your matches shares a particular ethnicity, and to what degree.

Cautions

  • Just because you share a particular ethnicity does NOT mean you match on the shared ethnic line. Your common ancestor with that person may be on an entirely unrelated line.
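The display rule for Shared Origins, where you only see a match’s categories that you also carry, can be sketched as a dictionary filter in Python. The percentages below are invented for illustration, and the function name is hypothetical.

```python
def shared_origins(mine, theirs):
    """Return the match's percentages for only the categories both people carry."""
    return {
        category: theirs[category]
        for category in mine
        if mine[category] > 0 and theirs.get(category, 0) > 0
    }


# Hypothetical percentages for illustration.
my_origins = {"European": 97, "Middle Eastern": 3}
frances_origins = {"European": 80, "African": 20}

what_i_see = shared_origins(my_origins, frances_origins)
what_frances_sees = shared_origins(frances_origins, my_origins)
```

In this example I see only Frances’s European percentage, and she sees only mine. Her African percentage and my Middle Eastern percentage stay hidden from each other.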

#9 – Spreadsheet Matching

Family Tree DNA offers you the ability to download your entire list of matches, including the specific segments where your matches match you, to a spreadsheet.

This is the granddaddy of the tools and it’s a tool used by all serious genetic genealogists. It requires the most investment from you both in terms of understanding and work, but it also yields the most information.

The power of spreadsheet comparisons isn’t in the 5 people I pushed through to the chromosome browser, in and of themselves, but in the power of looking at the locations where all of your matches match you and known relatives on particular segments.

Utilizing the chromosome browser, we saw that chromosome 3 had an overlap match between Rex (green) and Cheryl (blue) as compared to my mother (background chromosome.)

FF9 chr 3

We see that same overlap between Cheryl and Rex when we download the match spreadsheet for those 5 people.

However, when we download all of my mother’s matches, we have a much more powerful view of that segment, below. The 2 segments we saw overlapping on the chromosome browser are shown in green. All of these people colored pink match my mother on some part of the 37cM segment she shares with Rex.

FF9 spreadsheet match

This small part of my master spreadsheet combines my own results, rows in white, with those of my mother, rows in pink.

In this case, I only match one of these individuals that mother also matches on the same segment – Rex. That’s fine. It just means that I didn’t receive the rest of that DNA from mother – meaning the portions of the segments that match Sam, Cheryl, Don, Christina and Sharon.

On the first two rows, I did receive part of that DNA from mother, 7.64 of the 37cMs that Rex matches to Mom at a threshold of 5cM.

We know that Cheryl, Don and Rex all share a common ancestor on mother’s father’s side three generations removed – meaning John David Miller and Margaret Lentz. By looking at Cheryl, Don and Rex’s matches as well, I know that several of her matches do triangulate with Cheryl, Don and/or Rex.

What I didn’t know was how Christina fit into the picture. She is a new match. Before the new Phased Family Matching, I would have had to go into each account, those of Rex, Cheryl and Don, all of which I manage, to be sure that Christina matched all of them individually in addition to Mom’s kit.

I don’t have to do that now, because I can utilize the phased Family Matching instead. The addition of the Family Matching tool has taken this from three additional steps, assuming I have access to all kits, which most people don’t, to one quick definitive step.

Cheryl and Don are both mother’s first cousins, so matches can be phased against them. I have linked both of them to mother’s kit so she now has several individuals who are phased to Don and Cheryl which generate paternal icons since Don and Cheryl are related to mother on her father’s side.

Now, instead of looking at all of the accounts individually, my first step is to see if Christina has a paternal icon, which, in this case, means she phased against either Don and/or Cheryl since those are the only two people linked to mother who qualify for phasing, today.

FF9 parental phased match

Look, Christina does have a paternal icon, so I can add “Dad” into the side column for Christina in the spreadsheet for mother’s matches AND I know Christina triangulates to Mom and either Cheryl or Don, whichever cousin she phased against.

FF9 Christina chr 3

I can see which cousin she phased against by looking at the chromosome browser and comparing mother against Cheryl, Don and Christina.  As it turns out, Christina, in green, above, phased against both Cheryl and Don whose results are in orange and blue.

It’s a great day in the neighborhood to be able to use these tools together.

Prerequisites

  • Must download matches spreadsheet through the chromosome browser, adding new matches to your spreadsheet as they occur.
  • Must have a familiarity with Excel or another spreadsheet.
  • Must learn about matching, match groups and triangulation.

Power Features

  • The ability to control the threshold you wish to work with. For matches over the match threshold, Family Tree DNA provides all segment matches to 1cM with a total of 500 SNPs.
  • The ability to see trends and groups together.
  • The ability to view kits from all of your matches for more powerful matching.
  • The ability to combine your results with those of a parent (or sibling if parents not available) to see joint matching where it occurs.
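As a rough illustration of controlling your own threshold, here’s a small Python sketch that filters downloaded segments by both cM and SNP count. The field names and the sample rows are assumptions made up for this example, not the vendor’s file format.

```python
# Each row mimics one line of a segment download; field names are assumed.
segments = [
    {"name": "Rex", "cm": 37.0, "snps": 9000},
    {"name": "Rex", "cm": 7.64, "snps": 1800},
    {"name": "Sam", "cm": 3.2, "snps": 700},
    {"name": "Don", "cm": 1.4, "snps": 500},
    {"name": "Sharon", "cm": 6.1, "snps": 450},
]

def apply_threshold(rows, min_cm=5.0, min_snps=500):
    """Keep only segments at or above both the cM and the SNP thresholds."""
    return [r for r in rows if r["cm"] >= min_cm and r["snps"] >= min_snps]

at_5 = apply_threshold(segments)               # 5 cM and 500 SNPs
at_1 = apply_threshold(segments, min_cm=1.0)   # 1 cM and 500 SNPs
print(len(at_5), len(at_1))
```

Lowering the cM threshold lets more rows through, which is exactly why the same match list can look different at different settings.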

Cautions

  • There is a comparatively steep learning curve if you’re not familiar with using spreadsheets, but it’s well worth the effort if you are serious about proving ancestors through triangulation.

Summary

I’m extremely grateful for the full complement of tools available at Family Tree DNA.

They provide a range of solutions for users at all levels – people who just want to view their ethnicity or to utilize matches at the vendor site as well as those who want tools like a chromosome browser, projects, ICW, not ICW, the Matrix, ethnicity matching, combined advanced matching and chromosome browser downloads for those of us who want actual irrefutable proof.  No one has to use the more advanced tools, but they are there for those of us who want to utilize them.

I’m sorry, I’m not from Missouri, but I still want to see it for myself. I don’t want any vendor taking the “trust me” approach or doing me any favors by stripping out my data. I’m glad that Family Tree DNA gives us multiple options and doesn’t make one size fit all by using a large hammer and chisel.

The easier, more flexible and informative Family Tree DNA makes the tools, the easier it will be to convince people to test or download their data from other vendors. The more testers, the better our opportunity to find those elusive matches and through them, ancestors.

The Concepts Series

I’ve been writing a “Concepts” series of articles. Recent articles have been about how to utilize and work with autosomal matches on a spreadsheet.

You might want to read these Concepts articles if you’re serious about working with autosomal DNA.

Concepts – How Your Autosomal DNA Identifies Your Ancestors

Concepts – Identical by…Descent, State, Population and Chance

Concepts – CentiMorgans, SNPs and Pickin’ Crab

Concepts – Parental Phasing

Concepts – Downloading Autosomal Data from Family Tree DNA

Concepts – Managing Autosomal DNA Matches – Step 1 – Assigning Parental Sides

Please join me shortly for the next Concepts article – Step 2 – Who’s Related to Whom?

In the meantime:

  • Make full use of the autosomal tools available at Family Tree DNA.
  • Test additional relatives meaning parents, grandparents, aunts, uncles, half-siblings, siblings, any cousin you can identify and talk into testing.
  • Take test kits to family reunions and holiday gatherings. No, I’m not kidding.
  • Don’t forget Y or mtDNA which can provide valuable tools to identify which line you might have in common, or to quickly eliminate some lines that you don’t have in common. Some cousins will carry valuable Y or mtDNA of your direct ancestral lines – and that DNA is full of valuable and unique information as well.
  • Link the DNA kits of those individuals you know to their place in your tree.
  • Transfer family kits from other vendors.

The more relatives you can identify and link in the system, the better your chances for meaningful matches, confirming ancestral relations, and solving puzzles.

Have fun!!!

______________________________________________________________

Concepts – Parental Phasing

I recently used a technique called parental phasing as part of the proof that one Curtis Lore found in Pennsylvania was the same person as Curtis Benjamin Lore, found later in Indiana.  Given that I’ve already used parental phasing as part of a proof argument, I’d like to break it down further and explain the concepts behind parental phasing, what it is, why it is so important, and why it works so well.

For those of you who don’t have at least one parent available to test, I’m truly sorry, and not just because of the lost DNA opportunity. But please do read this article, because you may be able to substitute other family members and derive at least some of the benefits, although clearly not all.

What is Parental Phasing?

The fundamental concept of parental phasing is that the only way you can obtain your DNA is through one or the other of your parents, so every one of your matches should match you plus one of your parents. Right?

Should, yes, but that’s not exactly how autosomal matching works in real life.

You can match someone in one of two ways:

  1. Because you received the matching segment from one of your two parents, and they received that same segment from one of their two parents, a circumstance that is called identical by descent or IBD.
  2. Because your match’s DNA is zigzagging back and forth between the DNA you inherited from both of your parents, or your DNA is zigzagging back and forth between their parents, either of which is called identical by chance or IBC.

I wrote about this in the article titled, Concepts – Identical by…Descent, State, Population and Chance.

Here’s the matching “Identical By” cheat sheet since you may find it helpful in this article as well.

Identical by Chart

How Does Parental Phasing Work?

Parental phasing works by comparing your DNA against your match’s DNA, then comparing your match’s DNA against your parents’ DNA, and telling you which, if either, or both, parents they match in addition to you. Oh yes, and there’s one more tiny tidbit – they must match you and your parent(s) on the same segment(s).

As bizarre as it sounds, sometimes your match will match you on one segment, and match your parents on an entirely different segment.  While this was not an expected finding, it does happen, and frequently enough that it was found in every parental phasing test run – so it’s not an anomaly or something so rare you won’t see it.

Therefore, parental phasing may be a two part process, where:

  • Step 1 is determining whether or not your match matches either or both of your parents.
  • Step 2 is determining if your match matches you and your parent on the same segment(s), or at least part of the same segment. If not, then it’s not a phased IBD match – even though they do match you and your parent.
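The two steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each segment is a hypothetical (chromosome, start, end) tuple, and “same segment” is taken to mean any overlap on the same chromosome.

```python
def overlaps(a, b):
    """True when two (chromosome, start, end) segments share any positions."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def phase_status(match_to_me, match_to_parent):
    """Classify one match using the two-step check.

    match_to_me:     segments where the match matches me
    match_to_parent: segments where the same match matches my parent
    """
    # Step 1: does the match share any segments with the parent at all?
    if not match_to_parent:
        return "no parent match"
    # Step 2: does at least one segment overlap between me and the parent?
    if any(overlaps(m, p) for m in match_to_me for p in match_to_parent):
        return "phased"
    return "parent match, different segment"

print(phase_status([(3, 100, 200)], [(3, 150, 250)]))
```

The third outcome is the surprising one described above: the person matches both you and your parent, just not in the same place.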

Conceptually, each of your matches will fall nice and cleanly into one, or both, of your parents’ buckets. Let’s look at a couple of examples.  For each of the people who match you, they will also match your parents on the same segment as follows:

Match | Matches Your Mother | Matches Your Father | Matches Neither Parent | Comment
Susie | Yes | No | | From Mom’s side, IBD
John | No | Yes | | From Dad’s side, IBD
Bob | Yes | Yes | | Matches both parents’ lines, IBD and may be IBP
Roxanne | No | No | Yes | Identical by Chance, IBC

Please Note: Your match list will change if you change your matching threshold, and so will your phased matches to your parents.  In other words, while someone might not match you and a parent both on the same segment at 15cM, you might well match on a common segment at a 10, 7 or 5cM threshold.

So in essence, parental phasing puts your matches into very useful buckets for you and helps eliminate false positives – or matches that appear real but aren’t.
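Here’s a minimal Python sketch of that bucketing, assuming you already know, for each match, whether they share a common segment with your mother and with your father. The function name and labels are made up for the example.

```python
def bucket(matches_mother: bool, matches_father: bool) -> str:
    """Assign a match to a bucket from its same-segment parent matches."""
    if matches_mother and matches_father:
        return "Both sides: IBD, may be IBP"
    if matches_mother:
        return "Mom's side: IBD"
    if matches_father:
        return "Dad's side: IBD"
    return "Neither parent: IBC"

# (matches mother, matches father) for each example person
matches = {
    "Susie": (True, False),
    "John": (False, True),
    "Bob": (True, True),
    "Roxanne": (False, False),
}
buckets = {name: bucket(*flags) for name, flags in matches.items()}
for name, label in buckets.items():
    print(name, "->", label)
```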

How Can Someone Match Me But Not My Parents?

That’s a really good question. Sometimes you match someone because you received common DNA from an ancestor, through your parents, which means you’re identical by descent (IBD), a legitimate genealogical match.  But other times, you match someone just by chance because their DNA is matching pieces of both of your parents’ DNA, and not because you actually share a common ancestor.

Let’s take a look.

This first graphic shows you with an identical by descent match to your match’s father’s DNA. Your match’s father shares a common relative with (at least) one of your mother’s lines.

Phase IBD

In the most basic terms, an identical by descent (IBD) match looks like this, where your match is matching you on one of your parent’s strands of DNA. Both matching strands are colored green in this example.

Of course, your DNA does not come labeled as to which side is mother’s and which side is father’s. You can read more about that here. If it did, we wouldn’t even need to be having this discussion at all – because that’s what parental phasing does.  It tells you which side of your family your DNA match came from.

You can see in the above example that you and your match both share an actual strand of DNA. You inherited yours from your Mom and your match inherited theirs from their Dad, which means your Mom and their Dad share a common ancestor.  However, to be able to discern that fact, that your Mom and your match’s Dad share a common ancestor, you need to be able to phase the DNA of both you and your match to know which parent that strand came from.

In reality, your DNA and their DNA is entirely mixed in each of you, shown in the chart below, and without additional information, neither of you will know which strand of DNA you match on, or who you inherited it from.  Initially, you will only know THAT you match.

Phase IBD2

So here’s what your DNA really looks like. It’s up to the DNA matching software to look at the two strands of your DNA that’s mixed together, and the two strands of your match’s DNA that’s mixed together and see if there is a common grouping of DNA at each location that extends for at least 10 locations in length, which is the “threshold” for our example that signifies a match that is likely to be “real” versus IBC, or identical by chance.  In my example, that common grouping is the green “Matching Portions” column, above.

An identical by chance match looks like the chart below. You can see that the green matching DNA is zigzagging back and forth between your parents’ DNA.

Phase IBC

It can even be worse where your match’s Mom’s and Dad’s DNA is also zigzagging back and forth, but you can certainly get the idea that there are all kinds of ways to NOT match but only three ways to legitimately match – Mom’s side, Dad’s side, or both.

So you can see that indeed, you do technically match, but not because you share a DNA segment of any size with one parent, but because your match’s DNA matches part of your Mom’s DNA and part of your Dad’s, which means that DNA segment does NOT come from one common ancestor, meaning not IBD. However, the matching software can’t tell the difference, because your strands aren’t coded to Mom and Dad.
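A toy Python example may make the zigzag easier to see. It is a simplification, not how any vendor’s matching software works: each strand is a made-up string of 10 locations, and two people are treated as matching at a location when they share at least one value there.

```python
def longest_run(flags):
    """Length of the longest unbroken run of True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

THRESHOLD = 10  # locations in a row needed to call a match in this toy example

# Invented strands: what you received from each parent, and your match's two strands.
from_mom = "AAAAAAAAAA"
from_dad = "CCCCCCCCCC"
match_strand_1 = "ACACACACAC"
match_strand_2 = "GGGGGGGGGG"

# Unphased view: your two strands are mixed, so either value can satisfy the match.
unphased = [bool({m, d} & {x, y})
            for m, d, x, y in zip(from_mom, from_dad, match_strand_1, match_strand_2)]
# Phased view: test the match against one parent's strand at a time.
mom_only = [m in (x, y) for m, x, y in zip(from_mom, match_strand_1, match_strand_2)]
dad_only = [d in (x, y) for d, x, y in zip(from_dad, match_strand_1, match_strand_2)]

print(longest_run(unphased) >= THRESHOLD)   # looks like a match
print(longest_run(mom_only), longest_run(dad_only))
```

Mixed together, the match clears the 10-location threshold. Tested against Mom’s strand alone or Dad’s strand alone, it never gets past a single location in a row, which is the signature of an identical by chance match.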

What parental phasing does is to assign your matches to “sides” or buckets based on whether they match your Mom or Dad in addition to you.

One Parent Matches

In my case, I only have one parent whose DNA is available. Therefore, all of my matches will either match both my mother and me, or not.  The balance that do not match me and my mother, both, will either match to my father or will be IBC, identical by chance matches.  Unfortunately, just by utilizing one-parent phasing, I can’t tell if the “non-Mom” matches are really to my father or are IBC.

Let’s look at an example.

Match | Mom’s Side | Dad or IBC | Comment
Denny | Yes | Probably not | Mom’s side, could also match on Dad’s side but we have no way to tell. My parents’ lines come from different parts of the world except that they both married into Native American lines.
Sally | No | Yes | Can’t tell whether Dad’s side or IBC
Derrell | No | Yes | Also matches cousin on Dad’s side on same segments, so Derrell is assigned to Dad’s side pending triangulation.

By using the ICW tool at Family Tree DNA, shown below, I can see who matches me and my matches, both – in this case, me and my mother.

No Parent Matches

If I have no parents in the system, but several other close family members, like uncles or cousins, I can easily see who else I match in common with my match.

In other words, without my mother to match, Denny will either match my Mom’s side family members, and I can tentatively group him there, my Dad’s side family members, and I can tentatively group him there, or neither, in which case I can’t do anything with him except note that fact.

An Example

I’m going to use my proven cousin Denny for my examples, because that’s who I used in my Curtis Lore case study and our connection is proven both genetically and genealogically.

Here’s Denny’s match list. My mother is Denny’s closest match and I’m his second closest.

Phase match list

Therefore, I can use the ICW technique to effectively put my matches into buckets that divide my DNA in half, if I have both parents.

If I have one parent, I can fill one bucket for sure by putting everyone who matches both my mother and me into the “mother” bucket. The balance will be in the “Father +IBC” bucket.

This is easy to do at Family Tree DNA by using the crossed arrow ICW tool to find everyone who matches me in common with my mother.

Phase iCW

If I don’t have either parent, but I have an uncle or a cousin, I can still assign some matches to buckets by utilizing this same ICW tool. What I can’t do without both parents is to eliminate IBC or identical by chance matches from my match list.  I need both parents or at least well fleshed out match groups to do that.  There are examples of using match groups to identify IBC matches in the article, Identical By…Descent, Chance, Population and State.

Furthermore, I will need to download my match lists for both my mother and myself to verify that each person matches both my mother and myself on a common segment.

Testing the Theory

Let’s use my real life example and see how this works. I’m going to utilize three generations, because this gives us the ability to see the parental phasing work twice.  In this illustration, below, four people have tested, Denny, Mother, Me and My Child.

Phase pedigree

Denny and my child, who are 3rd cousins once removed, match on the following DNA segments, utilizing the Family Tree DNA chromosome browser.  We are comparing against Denny, meaning he is the “background” black chromosome.  The orange illustrates where my child matches Denny.

Phase browser denny child

There are no matching segments on chromosomes 18-22.  I have not included X chromosome matching.

Here’s the same information in chart format.

Phase chart denny child

You can see that Denny and my child have several fairly significant segment matches, along with some smaller ones too. The question is, which of those segments are legitimate, meaning IBD and which are not, meaning IBC?

Let’s phase my child against my DNA and see which of these segment matches hold up.

My child is orange, and I am blue and we are both matching against cousin Denny.

phase browser denny child me

As you can see, many of those segments are legitimate because Denny matches both me and my child on the same segments. So they are not IBC, or identical by chance, but IBD, identical, literally, by descent – because my child received them from me.

In some cases, Denny matches only me, blue, which is fine because all that means is that either our matches are IBC or I didn’t pass that DNA to my child. Both matches on chromosome 3 are to me (blue) and not to my child (orange).

However, in the cases where Denny matches my child (orange,) and not me (blue,) on the same segments, that means that either Denny and my child share an ancestor that is through my child’s father or the matches are IBC.  Those matches are not through me.  In other words, those segments did not pass phasing.  You can see examples of that on chromosomes 1, 4 and 14, and partial matches on 11 and 12.

Chromosome 16 shows a really good example of a crossover event where my child, orange, received part of my DNA, blue, but about half way through my segment, it was divided and my child inherited part of mine and the other half from their father.  So, visually, you can see that my child only matches Denny on about half of the segment where I match Denny.
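To put a number on how much of a parent’s matching segment reached the child, you could compare the start and end positions, as in this small Python sketch. The positions are invented, and a fraction of positions is only a rough stand-in because centiMorgans are not spread evenly along a chromosome.

```python
def shared_portion(parent_seg, child_seg):
    """Overlap of two (start, end) ranges on the same chromosome, or None."""
    start = max(parent_seg[0], child_seg[0])
    end = min(parent_seg[1], child_seg[1])
    return (start, end) if start < end else None

def fraction_passed(parent_seg, child_seg):
    """Share of the parent's segment (by position) that the child also matches on."""
    common = shared_portion(parent_seg, child_seg)
    if common is None:
        return 0.0
    return (common[1] - common[0]) / (parent_seg[1] - parent_seg[0])

# Invented positions: the child matches on the first half of the parent's segment.
me = (10, 30)
child = (10, 20)
print(shared_portion(me, child), fraction_passed(me, child))
```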

Matches Spreadsheet

I downloaded the results of both Denny’s matches to me and Denny’s matches to my child into one Matches Spreadsheet and have color coded them so that you can see the relationships.  If Denny matches both me and my child, you will see a common segment on that chromosome for both me and my child in the spreadsheet.  Rows where Denny matches my child are light orange and rows where Denny matches me are light blue, similar to the chromosome browser colors.

Denny Me Child

There are only three possible conditions and I have colored the chromosome column accordingly:

  • Denny matches me only – dark teal – may be a legitimate match but we don’t have enough information to tell at this point
  • Denny matches my child only, but not me – red – NOT a legitimate match – identical by chance (IBC)
  • Denny matches me and my child both – boxed green – a legitimate identical by descent (IBD) match
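The three conditions above can be sorted mechanically. Here’s a hedged Python sketch that takes two invented lists of (chromosome, start, end) segments, one for Denny’s matches to me and one for his matches to my child, and files each segment under one of the three headings.

```python
def overlaps(a, b):
    """True when two (chromosome, start, end) segments share any positions."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def sort_segments(to_me, to_child):
    """File every segment under 'both', 'me only' or 'child only'."""
    result = {"both": [], "me only": [], "child only": []}
    for seg in to_me:
        key = "both" if any(overlaps(seg, c) for c in to_child) else "me only"
        result[key].append(("me", seg))
    for seg in to_child:
        key = "both" if any(overlaps(seg, m) for m in to_me) else "child only"
        result[key].append(("child", seg))
    return result

# Invented segments for illustration only.
to_me = [(2, 10, 50), (3, 5, 20), (16, 100, 200)]
to_child = [(2, 10, 50), (1, 30, 40), (16, 100, 150)]
result = sort_segments(to_me, to_child)
for label, rows in result.items():
    print(label, rows)
```

In this sketch “both” corresponds to the boxed green rows, “me only” to teal and “child only” to red.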

You’ll note that some of these matches are exact. For example on the first matching segment of chromosome 2, below, my child received this entire segment of my DNA.  It was not divided at all.

Denny Me Child 2

However, in the next two matching groups on chromosome 2, my child received most of the DNA I share with Denny, but some was shaved off, but not half.

Denny Me Child 2 shaved

On chromosome 16, my child received almost exactly half of the DNA segment that I share with Denny.

Denny Me Child 16

On chromosomes 11 and 17, my child shares more DNA with Denny than I do, which means that all of that DNA isn’t ancestral through me. In this case, either there are some fuzzy boundaries, a read error, part of the DNA is IBD and part is IBC or part of the DNA is matching through both parents.

Denny Me Child 17 c

On chromosome 14, I match Denny, but my child received none of that DNA, which is why I’ve added the color teal.

Denny Me Child 14 c

Now, let’s phase me against my mother and see how the DNA matches hold up in a third generation.

Adding the Next Generation

The view of the chromosome browser below shows Denny matching my child, in orange, me in blue and my mother in green.

Amazingly, many of these segments follow through all three generations.

phase browser denny child me mother

Let’s see how the various matches stacked up, pardon the pun.

I’ve added Denny’s matches to mother to the Matches Spreadsheet and her rows are colored green.

On the Matches Spreadsheet from the first example, there were several segments where Denny matched only me and not my child. They were colored teal.  In the chart below, so we can track those segments, I have colored them teal in the matchname column, and you can see the resolution of how they did or didn’t survive phasing against my mother in the chromosome column.

Of those 11 segments, 2 phased with my mother, the rest did not. That makes sense, since none of those are segments I passed on to my child, so they would be more likely to be IBC.

Denny me Child Mom SS

The legend for the spreadsheet above is as follows:

  • Dark teal in chromosome column – Denny matches Mom only – may be a legitimate match but we don’t have enough information to know (chromosomes 1, 2, 4, 5, 6, 7, 9, 12 and 15)
  • Dark teal in matchname column, plus red in chromosome column – previously Denny matched only me, now I do not phase against my mother, so this is an IBC match (chromosomes 1, 3, 4, 5, 6, 7, 10, 12 and 17)
  • Dark teal in matchname column, plus green box in chromosome column – previously Denny only matched me, but now this segment is parentally phased and considered legitimate (chromosomes 2 and 10)
  • Red in chromosome column – does not phase against parent, so not a legitimate match – IBC (chromosomes 1, 3, 4, 5, 6, 7, 10, 11, 12, 14 and 17)
  • Green box indicates a phased match – considered IBD and legitimate (chromosomes 1, 2, 10, 14, 15, 16 and 17)

Anomalies

*So what the heck happened with chromosome 11?

In the first example, this segment received a green box because Denny matched both me and my child on a partial segment, which means that partial segment is phased and considered legitimate.

denny me child mom ss 11 grn

When we moved to the next generation, phasing against my mother, Denny does not match my mother on this segment, so it could NOT have arrived in me and my child via my mother, so it is not IBD, even though it appeared that way initially. Because of this, I’ve changed the box color to red for a non-IBD match.

Denny me Child Mom SS 11

How could this happen?

First, it’s a very small segment overlap match, and second, Denny matched more to my child than to me, which is a neon warning sign that this segment match is suspect, especially those two conditions in combination with each other.
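One way to picture the chromosome 11 result is to ask whether a single region is shared by every generation tested. The Python sketch below does that with invented (start, end) positions, using None to stand for a generation with no match on that chromosome. It is an illustration of the logic, not the method behind my spreadsheet.

```python
def common_region(*segments):
    """Region shared by every (start, end) segment, or None.

    A None entry means that person has no matching segment here,
    so nothing can be common to all generations.
    """
    if any(s is None for s in segments):
        return None
    start = max(s[0] for s in segments)
    end = min(s[1] for s in segments)
    return (start, end) if start < end else None

# Invented positions: child and parent overlap, grandparent has no match.
child, me, mother = (150, 300), (100, 200), None
print(common_region(child, me))          # phases in two generations
print(common_region(child, me, mother))  # does not hold in the third
```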

Here’s an example of how, genetically, a match could phase with a parent in one generation, but not hold into the next generation.

phase n o phase

This match matches both me and my child (gold), but not my mother, who has no gold. As you can see, the match does accrue 10 gold location matches in a row, but not 10 green ones, so doesn’t match my mother.  The larger the number of locations in a row required to be considered a match, the less likely this type of random matching will be to occur.

This is both the purpose and the quandary of thresholds: finding that sweet spot that doesn’t eliminate real matches, but is high enough to be useful in eliminating false positive (IBC) matches.  And I can tell you, there are just about as many opinions on what that threshold number should be as there are people giving opinions – and everyone seems to have one!  You can read more about this in the article, Concepts – CentiMorgans, SNPs and Pickin’ Crab.

Segment Survival

Let’s take a look and see how many of which size segments survived parental phasing.  Are some of those smaller segments legitimate matches, or did we lose them in phasing?

The chart below shows the results in segment size order, color coded as follows:

  • Red = segments that did not phase and were IBC
  • Teal = segments that match Mom only and may or may not be valid. We don’t have any way to know without additional matches.
  • Green = segments that phased and are IBD

Phased cMs by size

As you would expect, all of the larger segments phased, but surprisingly, so did several of the smaller segments, through three generations.

Given the fact that teal matches did not phase, for the most part, in the previous example, and given that the teal segments are mostly small, my suspicion would be that most of these teal segments would not phase (with the probable exception of the 10.27 cM segment), if we have the opportunity to find out – which we don’t.

This example is for a non-endogamous line, or better stated, with distant endogamous groups in multiple lines. Endogamous results would probably be different.

Statistics

What do our statistics look like?

There were 58 matching segments between Denny, my child, me and my mother.

Match | To Whom | # Segments | # Phased | %
Denny | My Child | 12 | 8 | 75
Denny | Me | 22 | 11 | 50
Denny | Mother | 24 | Probably at least 11 |
Total | | 58 | |

Of those 58 total matches, 16 were IBC meaning they did not match up through my mother.

Total Segment Matches | IBC (no phase) | IBD (phase) | Just Mother | Match Groups | 2 gen Groups | 3 gen Groups
58 | 16 | 29 | 13 | 12 | 3 | 9
% | 28% | 50% | 22% | | 25% | 75%

Thirteen match just to mother (teal), of which one, on chromosome 12 for 10.27 centiMorgans, is the most likely to be legitimate, or IBD. The rest were smaller segments and none were passed to the child, so they are less likely to be legitimate, or IBD.

There are a total of 12 matching groups, of which 3 are for only two generations, me and mother. In other words, not all of that DNA got passed on to my child, but at least some of it did 9 of those 12 times.
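The percentages in the summary follow directly from the counts. Here’s a short Python check, using only the numbers given above and rounding to the nearest whole percent.

```python
total_segments = 58
counts = {"IBC (no phase)": 16, "IBD (phase)": 29, "Just Mother": 13}

# The three categories should account for every segment.
assert sum(counts.values()) == total_segments

segment_pcts = {label: round(100 * n / total_segments) for label, n in counts.items()}

match_groups = 12
group_counts = {"2 gen groups": 3, "3 gen groups": 9}
group_pcts = {label: round(100 * n / match_groups) for label, n in group_counts.items()}

print(segment_pcts)
print(group_pcts)
```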

Does Size Matter?

I wanted to see how the small versus large segments fared in terms of three generations of parental phasing. Are smaller segments legitimate or not?  Do they stand up?  The “Phased cMs by Size” chart above was sorted in segment size order, with teal being a match to mother only (so we don’t know if it phased), green meaning the segment DID phase and red meaning it DID NOT phase with the parent.

Removing the teal blocks, which match to mother only, meaning we don’t know if they would parentally phase or not, leaves us with the blocks that had the opportunity to phase, and whether they passed or failed. 100% of the blocks 3.57cM and above phased.  A natural dividing line seems to occur about the 3.5 cM level, shown below.

phased cms by size less teal

It’s interesting that all matches above 3.36 cM phased, several of them twice, through three generations or two transmission (inheritance) events. Of those, 9, or 43% were under the 10cM threshold suggested by some, and 7, or 33% were under the 7cM threshold.

Most of the segments 3.36 cM and below did not pass phasing. Of those, 6 or 26% did pass phasing, while 17, or 74%, did not.  Note that this cM level is with the SNP threshold set to 500 SNPs, which is generally the lowest number I use.

Segment Size | # of Segments | # Segments Phased | %
Larger than 3.5 cM | 21 | 21 | 100
Smaller than 3.5 cM | 23 | 6 | 26
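If you want to run the same comparison on your own family, a pass rate above and below a cutoff is easy to compute. This Python sketch assumes a list of (cM, phased) pairs; the seven sample segments are invented and far fewer than the 44 in the table.

```python
def phase_rates(segments, cutoff_cm=3.5):
    """Percent of segments that phased, at or above and below the cutoff."""
    above = [phased for cm, phased in segments if cm >= cutoff_cm]
    below = [phased for cm, phased in segments if cm < cutoff_cm]

    def rate(flags):
        # None when a bin is empty, so we never divide by zero.
        return round(100 * sum(flags) / len(flags)) if flags else None

    return rate(above), rate(below)

# Invented (cM, phased) pairs for illustration.
sample = [
    (10.27, True), (7.64, True), (3.57, True),
    (3.36, False), (2.9, True), (1.5, False), (1.2, False),
]
print(phase_rates(sample))

# The counts from the table give the same percentages shown there.
print(round(100 * 21 / 21), round(100 * 6 / 23))
```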

Are these results a function of this particular family, or would this hold if more parental generational phasing studies were performed?

Let’s see. 

The Threshold Study

I was surprised by the seemingly low threshold of 3.5 cM that appeared to be the rough dividing line for cMs that passed parental phasing and those that did not. I undertook a small study of four additional 3 generation non-endogamous families.

I’ve included the Lore study that we discussed above in the first column.

I have also removed all duplicates in the results below, since the duplicates were an artifact of matching groups where we had three generations to match.

I completed 4 different three-generation studies in 4 unrelated non-endogamous families and noted the rough threshold for where matches seem to pass or fail phasing – in other words, the fall line. In all 4 examples below, the threshold was between 2.46 and 3.16 cM.  You could move it slightly higher, depending on what criteria you use for the “fall line,” which is why I’ve included the raw data.  In all cases, the SNP threshold was at 500 so you would not see any matches with fewer than 500 SNPs.

The black bar in the results below marks the location where the shift from fail to pass occurs in the various studies.
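One possible definition of the “fall line” is the smallest segment size at which that segment and every larger one phased. The Python sketch below uses that definition, which is my assumption for illustration, along with invented data. A different criterion would move the line, as noted above.

```python
def fall_line(segments):
    """Smallest cM value such that it and every larger segment phased.

    segments is a list of (cM, phased) pairs. Returns None when the
    largest segment itself failed to phase.
    """
    line = None
    # Walk from the largest segment down and stop at the first failure.
    for cm, phased in sorted(segments, reverse=True):
        if not phased:
            break
        line = cm
    return line

# Invented (cM, phased) pairs for illustration.
family = [
    (1.2, False), (2.9, True), (3.36, False),
    (3.57, True), (7.64, True), (10.27, True),
]
print(fall_line(family))
```

Note that the 2.9 cM segment phased but sits below the line, because a larger segment at 3.36 cM failed. That is the kind of scattered pass below the bar you can see in the raw data.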

4 family phasing

Additionally, I have one 4-generation study available as well. The closest related of the 4 generations that were being matched against were first cousins, then first cousins once removed, then first cousins twice removed (equal to 2nd cousins) then 1st cousins three times removed (equal to second cousins once removed).

You can see, below, that the pass/fail threshold for this 4 generation, 3 transmission study was also at 3.69 cM for valid segments that survived. The segments labeled “2 match” did not get passed to the younger generations, so they matched only in the oldest two generations. Segments labeled “3 match” matched in the oldest 3 generations, and “4 match” means the match survived through all 4 generations.

It’s interesting that even some of the smaller segments held through all 4 generations.

4 gen phasing

Ethnicity Matters

Clearly, parental phasing is only successful when you have matches. Of the three databases available for autosomal DNA comparisons today, Family Tree DNA and 23andMe likely have the largest representation of non-US participants, because the Ancestry.com test was not sold outside the US for quite some time.  The Family Tree DNA Family Finder test was sold in the most locations outside the US.

Family Tree DNA probably has the best representation of Jewish DNA of all of the databases.

Family Tree DNA projects facilitate the grouping of individuals by self-selected interest which includes ethnic categories, making those relationships visible by virtue of project membership wherein they are not readily evident in other databases.

Therefore, by virtue of who has tested, if your ancestry is not “US,” meaning a melting pot type of environment made up of people who are not recent arrivals, then you are likely to have fewer matches, and so fewer phased matches too.  If you have a high degree of any particular ethnicity, even if your ancestry is “US,” you may still have fewer matches.  For example, 3 of 4 of my mother’s grandparents were either German or Dutch, and she has 710 matches, or roughly half the matches that I have.  My father’s heritage was Appalachian, meaning Colonial American.

Here’s a quick chart showing the total matches as of April, 2016 for a number of individuals who contributed their match totals in Family Finder and who carry either no US heritage or a specific ethnicity.  For purposes of comparison, three individuals with typical mixed colonial US heritage are shown at the top.

Ethnicity match chart

People with high percentages of African heritage tend to have few matches today, as do those of purely European heritage. Unfortunately, not many Africans or African-Americans test their DNA and DNA testing is not as popular in Europe as it is in the US.  Many people in Europe are leery of DNA testing or don’t feel they need to test, because “we’ve always lived here.”   I’m hopeful that the sustained popularity of programs like Who Do You Think You Are and Finding Your Roots will encourage more people of all ethnicities and locations to test from around the globe.

People from highly endogamous populations have a different issue to deal with, as you can see from the very high number of Jewish matches in the chart above. Since these people descend from a common founder population, they share a lot of ancestral DNA that is identical by population, meaning they did receive it from an ancestor, so it’s not IBC, but they received that segment because that particular segment is very prevalent within that population.  Determining which ancestor contributed that piece of DNA is exceedingly difficult, if not impossible because several ancestors carried that same segment.

Therefore, while the segment is identical by descent, it’s probably not genealogically useful in a 100% endogamous scenario.

In an unpublished study, we discovered that while working with parentally phased Jewish results, it’s not unusual for up to half of the matches to not match the participant plus either parent on the same segments. Or conversely, they may match both parents, but the segments are comparatively small.  Matching to both parents in an endogamous population, without a known familial relationship, and without at least one relatively large segment, is an indicator of IBP, identical by population, matches.  For Jewish and other endogamous people, parental phasing is very promising, and will help them sort through irrelevant “diamond in the rough” matches indicated by no parent matches or smaller both parent matches to find the genealogically relevant gems.

In all parental phasing groups studied, no one lost less than 10% of their matches utilizing parental phasing and most people lost significantly more, up to half.  I would very much like to see these same kinds of 3 or 4 generation parental phasing studies done for groups of Jewish, other endogamous and African American families.  In order to do a study of one family, you need at least 3 generations who have tested and another known family member, like a first or second cousin perhaps, to match against.

In Summary

Dual parental phasing works wonderfully.  One parent phasing works pretty well too.  Even close relative phasing works, just not as well as parental phasing.  You can only work with the people you have available to test, so test every relative you can convince!

If you have one or both parents to test, by all means, do. You’ll be able to phase your matches against both of your parents individually and eliminate the majority of IBC matches.

If you have grandparents or their siblings available to test, do, and quickly so you don’t lose the opportunity. Test the oldest person/generation in each line that you can.

If you don’t have both parents, test your half and full siblings, all of them, the more the better, because they inherited parts of your parents’ DNA that you didn’t.

Find your closest relatives and test them, yes, all of them.

If you are testing parents, you don’t need to test their children too, because their children will only receive half of each parent’s DNA, and you already have the parents’ DNA.

Even if you can’t phase your matches utilizing your parents’ DNA, you can use the combination of your matches with other relatively close family members to assign or suggest matches to both sides of your family along family lines – creating match groups. For example, if your match matches you and your great-uncle Charlie on the same segment, then it’s very likely that match is from the common ancestral line shared by your common ancestor with great-uncle Charlie – your great-grandparents.  Triangulation, of course, will prove that.

Some of your relatives will be quite interested in DNA testing and others will be happy to test simply because it helps you, and they like to hear about the result of the genealogy research. I’ve discovered that providing a scholarship for the testing, especially for those people you really want to test, goes a very long way in convincing people that DNA testing for genealogy is something they might be interested in doing.  If you can’t personally afford a scholarship for everyone, try the old fashioned collection jar.  And no, I’m not kidding.  It works wonders and gives everyone an opportunity to participate and invest as well, as much as they can afford.

Ethnicity testing has a lot of sizzle for some folks too – so don’t just deliver the dry facts – be sure to talk about the sizzle too. Sizzle sells!  People get excited about the possibilities and of course, you’ll explain the result to them, so they get to visit with you a second time as well.  Something to look forward to at next summer’s picnic!

Be sure to take swab kits to family events; picnics, reunions, graduation parties, weddings and holiday gatherings. Believe me, I have a DNA kit in my purse or car at all times.  And maybe, if your extended family lives close by, resurrect the old-time Sunday afternoon tradition of “going calling.”  Not only can you collect DNA, you can collect family memories too and I guarantee, you’ll make a new discovery with every visit.  Take this opportunity to interview your relatives.

It’s amazing isn’t it, the things we do for this “DNA phase” that we’re all going through!

Acknowledgements

I want to thank Family Tree DNA for their ongoing support of projects and citizen scientists which makes these types of research studies possible. I also want to thank several individuals in the genetic genealogy community who provided their information and gave permission for me to incorporate their results into this article.  Without sharing and collaboration, these types of efforts would simply not be possible.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Concepts – CentiMorgans, SNPs and Pickin’ Crab

In autosomal DNA testing, you’ll often see two terms used together: centiMorgans, abbreviated as cM, and SNPs, which stands for single nucleotide polymorphisms.

These are two terms that are used to discuss thresholds and measurements of matching amounts of autosomal DNA segments.

These two terms, relative to autosomal DNA, are two parts of a whole, kind of like the left and right hand.

CentiMorgans are units of recombination used to measure genetic distance. You can read a scientific definition here.

For our conceptual purposes, think of centiMorgans as lines on a football field. They represent distance.

football fabric 2

SNPs are locations that are compared to each other to see if mutations have occurred.  Think of them as addresses on a street where an expected value occurs. If values at that address are different, then they don’t match.  If they are the same, then they do match.  For autosomal DNA matching, we look for long runs of SNPs to match between two people to confirm a common ancestor.

Think of SNPs as blades of grass growing between the lines on the football field.  In some areas, especially in my yard, there will be many fewer blades of grass between those lines than there would be on either a well maintained football field, or maybe a manicured golf course.  You can think of the lighter green bands as sparse growth and darker green bands as dense growth.

If the distance between 2 lines on the football field is 5cM and there are 550 blades of grass growing there, then with a match threshold of 5cM and 500 SNPs, you’ll be a match to another person if all of your blades of grass between those 2 lines match theirs.

So, for purposes of autosomal DNA, the combination of distance, centiMorgans, and the number of SNPs within that distance measurement determines if someone is considered a match to you. In other words, a match that is over the threshold as compared to your DNA is deemed to be relevant by the party setting the threshold.  Think of track and field hurdles.  To get to the end (match), you have to get over all of the hurdles!

hurdles

By Ragnar Singsaas – Exxon Mobil ÅF Golden League Bislett Games 2008, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=5288962

For example, a threshold of 7 cM and 700 SNPs means that anyone who matches you OVER BOTH of these thresholds will be displayed as a match.  So centiMorgans and SNPs work together to assure valid matches.
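The two-hurdle idea can be sketched in a few lines of Python. This is only an illustration, not any vendor’s actual matching code: the `Segment` class and the `passes_threshold` function are names I made up, and treating a segment that lands exactly on the threshold as passing is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One matching segment (hypothetical structure for illustration)."""
    chromosome: int
    centimorgans: float
    snps: int


def passes_threshold(segment, min_cm=7.0, min_snps=700):
    """Return True only if the segment clears BOTH hurdles.

    Assumption: a value exactly at the threshold counts as passing.
    """
    return segment.centimorgans >= min_cm and segment.snps >= min_snps


# Same genetic distance, but very different SNP density (made-up values).
long_but_sparse = Segment(chromosome=1, centimorgans=8.2, snps=450)
long_and_dense = Segment(chromosome=1, centimorgans=8.2, snps=900)

print(passes_threshold(long_but_sparse))  # False: fails the SNP hurdle
print(passes_threshold(long_and_dense))   # True: clears both hurdles
```

A segment in a “SNP desert” can be long enough in cM and still fail, which is why distance alone isn’t enough.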

Thresholds

These two numbers, cMs and SNPs, are used in conjunction with each other. Why?  Because the distribution of SNPs within cM boundaries is not uniform.  Some areas of the human genome have concentrations of SNPs and some areas are known as “SNP deserts.”  So distance alone is not the only relevant factor.  How many blades of grass growing between the lines matters.

Each of the vendors selects a default threshold that they feel will give you the best mix of not too many false positives, meaning matches that are identical by chance, and not too many false negatives, meaning people who do actually match you genealogically that are eliminated by small amounts of matching DNA. Unfortunately, there is no line in the sand, so no matter where the vendor sets that threshold, you’re probably going to miss something in either or both directions.  It’s the nature of the beast.

Company | Min cMs | Min SNPs | Comment
Family Tree DNA | 7cM for any one segment + 20cM total | 500 | After the initial match, you can view down to 1cM and 500 SNPs to people you match
23andMe | 7cM | 700 |
Ancestry | 5cM after Timber and associated phasing routines | Unknown | Timber population based phasing removes matches they determine to be “too matchy” or population based
GedMatch | User selectable – default is 7 | User selectable – default is 700 |

As you might guess, there are many opinions about the optimum threshold combinations to use – just about as many opinions as people!

These are important values, because the combined size of those matches to an individual allows you to roughly estimate the relationship range to the person you match.

As a general rule, the vendors do a relatively good job, with some exceptions that I’ve covered elsewhere and amount to beating a dead horse (Ancestry’s Timber, no chromosome browser). Of course, one of the big draws of GedMatch is that you can set your own cM and SNP matching thresholds.

Having said that, if you come from an endogamous population, you may want to raise your threshold to 10cM or even higher, depending on what you’re trying to accomplish.

Effectively Using cMs and SNPs

Your personal goals have a lot to do with the thresholds you’ll want to select.

If you are new at genetic genealogy, you will first want to pursue your best matches, meaning the highest number of matching centiMorgans/SNPs, because they will be the low hanging fruit and the easiest matches to connect genealogically. Said another way, you’ll match your closer relatives on bigger chunks of DNA, so concentrate on those first.  Successes are encouraging and rewarding!

Your match to a second cousin, for example, will have a significant amount of shared DNA, and second cousins share common great-grandparents – 2 of 8 people in that generation on your tree – so they are relatively easy to identify, as these things go.

The chart below shows the expected percentage of shared DNA in a given match pair, in this case, first and second cousins with a first cousin once removed thrown in for good measure. Also shown is the expected amount of shared centiMorgans for the given relationship, the average amount of shared DNA from a crowd sourced project titled The Shared cM Project by Blaine Bettinger and the range of shared DNA found in that same project.

A pedigree chart of my family members fitting those categories is shown below, plus the actual amount of shared cMs of DNA to the right.

shared cM table
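The expected percentages for these relationships follow from simple halving arithmetic, sketched below. The function name and the idea of counting “meioses” (the number of parent-to-child steps on the path connecting two people through one common ancestor) are my framing for illustration. The sketch assumes full cousins who share two common ancestors, and it gives only the theoretical average, not the range real cousins actually show.

```python
def expected_shared_fraction(meioses, common_ancestors=2):
    """Theoretical average fraction of autosomal DNA two relatives share.

    Each parent-to-child step halves the DNA passed along, so one common
    ancestor contributes (1/2) ** meioses. Full cousins share a couple,
    hence common_ancestors defaults to 2.
    """
    return common_ancestors * 0.5 ** meioses


# Steps connecting the two people through their common ancestors.
relationships = {
    "first cousin": 4,
    "first cousin once removed": 5,
    "second cousin": 6,
}

for name, meioses in relationships.items():
    print(f"{name}: {expected_shared_fraction(meioses):.3%}")
# first cousin: 12.500%
# first cousin once removed: 6.250%
# second cousin: 3.125%
```

Actual matches scatter around these averages because recombination is random, which is why the crowd sourced ranges are so useful.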

The chart below shows my DNA matches to my first cousin once removed, Cheryl.

Since we do match at Family Tree DNA above the match threshold, I can view all of my matching segments to Cheryl down to 1cM and 500 SNPs.

Cheryl chart

Just as a matter of interest, I’ve color coded the cM segments:

  • >10 cM = green
  • 7-10 cM = yellow
  • <7 = red

This means that if these were the largest matching segments, you would or would not be able to see them at the various thresholds of 7 and 10 cM.

If the matching threshold is at the default of 7cM, the green and yellow segments would be displayed.

If the matching threshold is set at 10cM, only the green segments would be displayed.
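Here is a small sketch of that color coding and of what a display threshold does to the list. The function names are mine, the segment sizes are made up for illustration, and how a segment sitting exactly on a boundary (7 or 10 cM) is handled is an assumption.

```python
def color_for(cm):
    """Color bands from the list above (boundary handling is an assumption)."""
    if cm > 10:
        return "green"
    if cm >= 7:
        return "yellow"
    return "red"


def displayed(segment_sizes_cm, threshold_cm):
    """Keep only the segments at or above the display threshold."""
    return [cm for cm in segment_sizes_cm if cm >= threshold_cm]


# Hypothetical segment sizes in cM, not the values from my spreadsheet.
sizes = [23.5, 12.1, 8.4, 6.14, 3.2, 1.38]

for threshold in (7, 10):
    shown = displayed(sizes, threshold)
    print(threshold, [(cm, color_for(cm)) for cm in shown])
```

At 7 cM the green and yellow segments survive; at 10 cM only the green ones do, and the red ones never appear at either setting.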

At Family Tree DNA, you can select various threshold display options when using the chromosome browser tool, but not for initial matching. In other words, you have to match at their default threshold before you can see your smaller segments or alter your threshold display.

Some people want to see all of their DNA that matches, and some only want to see the large and compelling pieces, those green segments.  Neither choice is wrong, simply a matter of personal preference and individual goals.

The “large and compelling” part of that statement brings me back to why you’re participating in genetic genealogy in the first place, those individual goals.  The larger segments are going to lead to common ancestors who are generally easier to find and identify, unless you have an unidentified parent or a misattributed parental event.

You would never start with smaller segments in terms of matching, but that does not mean those smaller segments are never useful.  In fact, after you’ve managed to analyze all of your low hanging fruit, and you’re ready to research or concentrate on those ugly brick walls, groupings of those smaller segments in descendants may just be your lifesaver.

Surviving Phasing

However, now I’m curious. How many of those smaller segments do stand up to the test of parental phasing, meaning they match both me and my parent?  If my match (Cheryl) matches both me and my parent, then Cheryl does not match me by chance on that segment so the match is genealogical in nature, the matching DNA proven to have descended to me from my mother.

Let’s see.

Cheryl Mom me chart

In order to phase my results with Cheryl against my mother, I copied Mother’s results into the same spreadsheet, above, color coding our rows so you can see them easier. “Cheryl matching Mom” rows are apricot and “Cheryl matching me” rows are yellow.

You can see that in some cases, like the first two rows, the two rows are identical which means I inherited all of Mom’s DNA in that segment and Cheryl inherited the same segment from her father, matching both Mom and me.

In other cases, I inherited part of Mom’s DNA on a particular segment.  I could also have inherited none of a particular segment.

In fact, of the 27 segments where I match Mom on any part of the segment, I match her on the entire segment 18 times, or 66.7%, and on part of the segment 9 times, or 33.3%.

I left the color coding in the cM column the same as it was before, in my rows, to indicate small, medium and large segments. The small segments are red, which would be the most likely NOT to phase with my mother, in other words, the most likely to be Identical by Chance, not descent.  If Cheryl and I are Identical by Chance on these segments, it means that the reason I’m matching Cheryl is NOT because I inherited that chunk of DNA from mother. If Mom and I both match Cheryl, then Cheryl and I are Identical by Descent, meaning I inherited that piece of DNA from my mother, so the match is not because Cheryl’s DNA is randomly matching that of both of my parents.

In the spreadsheet below, I removed mother’s rows to eliminate clutter, but I color coded mine. The rows that show red in the CHR and SNP columns BOTH are rows that did NOT phase with my mother, meaning these matches were indeed identical to Cheryl by chance.  The rows that are red ONLY in the cM column (and not in the CHR column) are small segments that DID phase with my mother, so those are identical by descent (IBD).

Cheryl Me phased chart

Here’s the interesting part.

  • All of the large segments, 10cM and over passed phasing. They are legitimate IBD matches.
  • One of the 2 medium cM matches passed phasing.
  • Of the 15 smaller segments, ranging in size from 1.38 cM to 6.14 cM, more than half, 8, passed phasing. Seven did not. The smallest segment to pass phasing was 1.38 cM. I suspect that part of the reason that the smaller cM segments are passing phasing is that the SNP threshold is held steady at 500 SNPs. In another (unpublished) study, dropping the SNP threshold below 500 results in a dramatic increase in matches (roughly fourfold) and a very small percentage of those matches phase with parents.
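The spreadsheet comparison I did by eye can be roughly sketched in code. This is a simplified illustration, not the way any vendor or tool implements phasing: the `Segment` structure, the function names and all of the coordinates are made up, and “phases” here simply means that the same cousin also matches the parent somewhere on an overlapping stretch of the same chromosome.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A matching segment: chromosome plus start and end positions."""
    chromosome: int
    start: int
    end: int


def overlaps(a, b):
    """True if two segments share any stretch of the same chromosome."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def phase_against_parent(match_to_child, match_to_parent):
    """Split the child's segments into phased and not phased.

    A child's segment is treated as phased (likely IBD) when the same
    match also matches the parent on an overlapping region.
    """
    phased, unphased = [], []
    for segment in match_to_child:
        if any(overlaps(segment, p) for p in match_to_parent):
            phased.append(segment)
        else:
            unphased.append(segment)
    return phased, unphased


# Hypothetical coordinates for a cousin's matches to me and to my mother.
cousin_to_me = [Segment(1, 1_000_000, 5_000_000), Segment(7, 20_000_000, 21_500_000)]
cousin_to_mom = [Segment(1, 1_000_000, 8_000_000)]

phased, unphased = phase_against_parent(cousin_to_me, cousin_to_mom)
print("phased:", phased)      # the chromosome 1 segment
print("unphased:", unphased)  # the chromosome 7 segment
```

With only one parent tested, a segment that fails to phase against that parent could still have come from the other parent, so “unphased” is only a candidate for IBC unless the relationship, like mine with Cheryl, is known to run through the tested parent.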

Small Segments Guidelines

There has been a lot of spirited debate about the usage, or not, of small segments, so I’m going to provide some guidelines.  Let me preface this by saying that none of this is worth getting your knickers in a knot, so please don’t.  If you don’t want to include or utilize small segments, then just don’t.

  • What is and is not a small segment can vary depending on who you are talking to and the context of the conversation.
  • Small segments CAN and do survive parental phasing, as shown above.
  • Small segments CAN be triangulated to a particular ancestor. Triangulated in this sense means that this segment is found in the descendants of a group of people (3 or more) proven to descend from the same ancestor AND who all match each other on the same segment.
  • Not all small segments can be triangulated to a common ancestor.  But then again, the same can be said for larger segments too.  It’s more difficult and unlikely to be successful with smaller segments unless you are starting with a group of people who descend from a common ancestor and are looking for “ancestral DNA.”
  • Small segments, even after triangulation, can be found matching a different lineage. This is an indicator that while the descendants of the first group share this DNA segment from a specific ancestor, it may also be prevalent in a population in general, which would cause the same segment to show up matching in a second lineage from the same region as well. I have an example where my Acadian line also matches a different German line on a particular segment – which really isn’t surprising given the geography and history of Germany and France.
  • Small segments without the benefit of other tools such as parental phasing, triangulation and match groups are, at this time, a waste of time genealogically. This may not always be the case.
  • Never start with small segments.
  • Never draw conclusions from small segments alone, meaning without corroborating evidence.
  • Use small segments only in context of a combination of parental phasing, triangulation and match groups.
  • Just because you match a group of people, out of context, on a segment (small or otherwise) doesn’t mean that you share a common ancestor. The smaller the segment, the more likely it is to be either IBC or IBP. Situations where the DNA is exactly the same from both parents, meaning everyone has all As in that location, for example, are called runs of homozygosity and the smaller the segment, the more likely you are to encounter ROH segments which appear as phased matches.  Yes, another cruel joke of nature.

As a proof point relative to how deceptive small segment matching out of context can be, I ran my kit against my friend who is unquestionably 100% Jewish. I have no Jewish ancestry.  At 7cM/700 SNPs we have no matches, at 3cM/300SNPs we have 7 matching segments.

Me to Jewish match

However, when I match this individual against my phased parents, none of these segments match both me and either one of my phased parents. Phased parent kits at GedMatch are kits reflecting the half of each parent’s DNA that I received from that parent.  If you have one or both parents who have tested, you can create phased kits with instructions from this article.

Lowering the match threshold even further to 100 SNPs and 1cM, my Jewish friend and I match on a whopping 714 tiny matching segments, over 1100 cM total, but all very small pieces of DNA. Because of the absolute known 100% Jewish heritage of my friend, and my known non-Jewish heritage, these matches must be either IBC, identical by chance or perhaps some small segments of IBP, identical by population from a very long time ago when both of our ancestors lived in the Middle East, meaning thousands of years ago.  Bottom line, they are not genealogically relevant to either of us.  I repeated this same experiment with someone that is 100% Asian, with the same type of results.  You will match everyone at this threshold, including ancient DNA matches tens of thousands of years old.

The message here is that you can work from the “top down” with small segments, meaning in a known relationship situation like with my cousin and other relatives, but you cannot work from the bottom up with small segments as you have no way to differentiate the wheat from the chaff.

In the Crumley study, there are groups of small segments (greater than 3cM/300SNPs) that persist in multiple descendants of James Crumley born in 1712.  In this case, because you can separate the wheat from the chaff with more than 50 participants, others who triangulate with those small segments and match the group of Crumley descendants may well share a common ancestor at some point in time, especially if they can phase with their parents on those segments to prove the match is not IBC.

  • Remember, your match on any segment to one person can be IBD meaning you have identified the common ancestor, your match to another person on that same segment IBC, and yet to a third person, IBP where your match survives generational phasing, but you may never find the common ancestor due to the age of the segment or endogamy.
  • When utilizing small segments, I generally don’t drop the SNP threshold below 500, as the number of matches increases exponentially and the valid matches decrease proportionately as well. I’ll be publishing more on this shortly.
  • I do fully believe, within this set of cautionary criteria, that small segments can be useful. I also believe that small segments can be very easily misinterpreted. The use of matching segments has a lot to do with combining different pieces of evidence to build confidence in what the “match” is telling you. I wrote about the Autosomal DNA Matching Confidence Spectrum here.
  • Small segments should only be utilized after one has a good grasp of how genetic genealogy works and by utilizing the tools available to restrict those segments to genealogically descended DNA. In other words, small segments are for the advanced user. However, maintain those small segment groupings and triangulations in your spreadsheet, because when you have the level of experience needed to work with those small segments, they’ll be available for you to work with.  You may discover that most of your DNA triangulates by using large segments and you don’t need to utilize those small segments at all.
  • If you send me a list of matches from GedMatch with the cM set to 1 and the SNPs set to 100 and ask me what I think, I would simply refer you to this article. But if I did reply, I would tell you that unless you have corroborating evidence, I think you’re wasting your time, but it’s your time and you’re welcome to do what you want with it. Life is about learning.
  • If you tell me you’ve drawn any conclusions from those types of matches (1cM and 100 SNPs), I’m going to be inconvincible without other tools such as genealogical proof,  parental phasing and triangulation groups that prove the segments to be valid to a specific ancestor for the people about whom you’re drawing conclusions. I might even suggest you look at the raw data in those segments to see if you’re dealing with runs of homozygosity.

Netting It Out

The net-net of this is that small segments can be useful, but it takes a lot more work because of the inherent questionable nature of small segment matches. This goes along with that old adage of “extraordinary claims require extraordinary evidence.”  Just be ready to roll up your shirt sleeves, because small segments are a lot more work!

Now having said all of that, I very much encourage continuing to triangulate your small segments and pay attention to them. You may notice patterns very relevant to your own genealogy, or you may learn that those patterns were somewhat deceptive – like IBD that turned into IBP.  Still useful and interesting, but perhaps not as originally intended.

Without continuing and ongoing research, we’ll never learn how to best utilize small segments nor develop the tools and techniques to sort the wheat from the chaff. Just be appropriately paranoid about conclusions based on small segments, especially small segments alone, and the smaller the segment, the more paranoid you should be!

There is a very big difference between working with small segments along with larger matching data and genealogy, which I encourage, and drawing conclusions based on small segment data alone and out of context, which I highly discourage.

Let’s hope that all of your matches come with large segments and matching ancestors in their trees!!!

Pickin’ Crab

You know, working with different cM levels and SNPs, especially as segments get smaller and more challenging, I’m reminded of “picking crab” at a good old North Carolina crab bake. You would never start out with a crab bake for breakfast.  You kind of have to work your way up to pickin’ crab – the same as small segments.  And you never pick crab alone. It’s a group activity, shared with friends and kin.  So is genetic genealogy.

You’ll need lessons, at first, in how to “pick crab” effectively. There’s a particular technique to it.  Friends teach friends.  You’ll find cousins you didn’t know you had, like Dawn in the brown shirt below, giving lessons to Anne.

Dawn lessons

A little practice and you’ll get it.

Just because it’s not easy doesn’t mean it’s not productive, especially when everyone works together!  And the results are “very good,” if you just have patience and work through the process.  If you decide that you “can’t pick crab,” then you’re right, you can’t pick crab, and you’ll just have to go hungry and miss out on all the fun!  Don’t let that happen.  Hint – sometimes the fun is in the pickin’!

Here’s hoping you can solve all of your brick walls with large cMs and large SNP counts, and if not, here’s hoping you enjoy “picking crab” with a group of friends and cousins who will contribute to the ongoing research.

Pickin’ crab, or working on identifying difficult ancestors is always better when collaborating with others! Find cousins and fellow collaborators and enjoy!!! Genetic genealogy is not something you can do alone – it’s dependent on sharing.

crab pickin

Sometimes it’s as much about the friends and cousins you meet on the journey and the adventures along the way as it is about the answer at the end.

______________________________________________________________


Concepts – Identical by…Descent, State, Population and Chance

In genetic genealogy, what does it mean when someone says they are “identical by” something…and what are those various somethings?

In autosomal DNA, where your DNA on chromosomes 1-22 (and sometimes X) is compared to other people for matches of a size that indicates a genealogical relationship, you can actually match people in different ways, for different reasons.

But first, let’s make one thing perfectly clear. There is only one way to obtain your autosomal DNA – and that’s through your parents, 50% from each parent.  However, how much of their (and your) ancestor’s DNA you receive is not necessarily half of what they received from that ancestor.

If you receive ANY DNA from that ancestor, it MUST BE through your parents. There is no other way to inherit DNA.

Period.

No. Other. Way.

If you would like to read the Concepts article about inheritance and matching, click here. If you don’t understand autosomal DNA inheritance and matching concepts, you won’t be able to understand the rest of this article.

Identical by Descent (IBD)

When you match someone because you share DNA from a common ancestor, that is called Identical by Descent, or IBD. That’s what you want.  That’s a good thing, genealogically speaking.

Let’s take a look at how an IBD segment of DNA works. In the graphic below, the strand location is in the first column.  The next two pink columns are the two strands that your mother carries, one from her Mom and one from her Dad – and the values in each location from each parent.  Columns 4 and 5 are the two blue strands of DNA carried by your Dad, one from his Mom and one from his Dad.  The final two columns are what you inherited from both your mother and your father.  In this case, we made it easy and you simply inherited one of each of their strands entirely.  Yes, that does happen in some cases for a particular chromosome segment, but not all of the time.  Conceptually, for this example, it doesn’t matter.

Identical 1

Your Inheritance

In this example, you inherited strand 1 from your Mom, all As and strand 2 from Dad, all Gs. Your match, shown in the graphic below, matches you on all As, so also matches your mother.  This phenomenon is called parental phasing, which means we know it’s a legitimate match because the person matches both you and one of your parents.

For purposes of this conceptual discussion you must match on all 10 locations for this to be considered a matching segment. So in this case, your matching threshold is “10 locations.”

Identical 2

Your Match Matches You and Your Mother’s DNA – Identical by Descent

Now, understand that while I’ve shown “You” with your strands color coded so you can see who you received which pieces of DNA from – that’s not how your DNA really looks. There is no color coding in nature.  I’ve added color coding to make understanding these concepts easier.

This is how your DNA and your parents’ DNA really look:

Identical 3

Notice that in your parents, their parents’ strands are mixed back and forth, so you really can’t tell which DNA came from whom.  It’s the same for you too.

What the matching software has to do is to look for a common letter between you and your match.

So, at location 1, you inherited an A and a G from your parents. Your match has an A and a T, so you and your match share a common A.  If you look at all of your match’s locations, they share a common A with you on all of those locations.  It just so happens you received that A from your mother – but without your Mom to compare to – you have no way to know which parent that particular DNA value came from.  So, the best matching software can do is to tell you that indeed, you do match – on 10 locations in a row – so this is considered a match and will be reported as such on your match list.
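Conceptually, that “look for a common letter” step might be sketched like this. It is a toy illustration rather than real matching software: the function names are mine, each location is written as an unordered pair of letters, and apart from the A and T at location 1, the match’s values are made up.

```python
def shares_allele(genotype_a, genotype_b):
    """True if two unordered letter pairs have at least one letter in common."""
    return bool(set(genotype_a) & set(genotype_b))


def is_matching_segment(person_a, person_b, threshold=10):
    """True if the longest run of consecutive locations with a shared
    letter reaches the threshold (10 locations in this conceptual example)."""
    run = longest = 0
    for genotype_a, genotype_b in zip(person_a, person_b):
        run = run + 1 if shares_allele(genotype_a, genotype_b) else 0
        longest = max(longest, run)
    return longest >= threshold


# You carry an A (from Mom) and a G (from Dad) at all 10 locations.
you = ["AG"] * 10
# The match carries an A plus some other letter at each location.
match = ["AT"] * 10

print(is_matching_segment(you, match))  # True: a common A at all 10 locations
```

Notice that the code never knows which parent the shared A came from. It only sees the unordered pairs, just like the real software.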

Why you match is another matter altogether.

And, ahem….there is another way to match someone, aside from receiving ancestral DNA from your parents. I know, this is a bad joke isn’t it.  Yes, it is, but it’s real.

So, to summarize, there is no other way to obtain your DNA except 50% from one parent and 50% from the other.

However there are two ways to match someone:

  • Identical by Descent, IBD, meaning you match someone because you share the same DNA segment that you received from an ancestor through a parent, as shown above.
  • Identical by Chance, IBC, meaning that you match someone, but randomly – not by inheritance.  How the heck can that happen?

Let’s look at how that can happen.

Identical by Chance (IBC)

Because you receive a strand of DNA from each of your parents, but that DNA is all intermixed in you, you can possibly match someone else by virtue of the fact that they aren’t actually matching your ancestral DNA segment inherited from an ancestor, but by chance they are matching DNA that bounces back and forth between your parents’ DNA.

Identical 4

Your Match Matches Neither of your Parents’ Strands of DNA – Identical by Chance

In this example, you can see that you inherited the same strands from your parents as in example 1 above, but your match is now matching you, not on your mother’s strand 1, all As, but on a combination of A from your mother and G from your father. Therefore, they don’t match either of your parents on this segment, because they are matching you by chance and not because you share a strand of DNA that you received from a common ancestor on this segment with your match.

This is easy to discern because while they match you, they won’t match either of your parents on that segment, because the match is not on an ancestral DNA segment, passed down from an ancestor. Using parental phasing, you compare your matches to your parents to see which “side” they fall on. If they fall on neither parent’s side, then they are IBC or identical by chance.

Identical 5

Identical By Chance Identified Through Parental Phasing

In this example, you can see that you match all of these people. By using parental phasing, you can tell that you are identical by descent (IBD) to everyone except John, who matches neither of your parents, so your match to John is identical by chance (IBC).  We will talk more in an upcoming article about Parental Phasing.
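The side-by-side comparison can be sketched in a few lines of code. This is only an illustration under the same simplified model (pairs of letters, 10 locations in a row); the `classify` function, the genotypes, and the cousins are hypothetical, with John built to zigzag between your mother's A and your father's G.

```python
def longest_shared_run(person_a, person_b):
    """Longest run of consecutive locations where the two people share a letter."""
    best = current = 0
    for geno_a, geno_b in zip(person_a, person_b):
        if set(geno_a) & set(geno_b):
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best

def classify(match, you, mother, father, threshold=10):
    """Parental phasing: see which parent, if any, the match also matches."""
    if longest_shared_run(match, you) < threshold:
        return "no match"
    on_mother = longest_shared_run(match, mother) >= threshold
    on_father = longest_shared_run(match, father) >= threshold
    if on_mother and on_father:
        return "IBD (both sides)"
    if on_mother:
        return "IBD (mother's side)"
    if on_father:
        return "IBD (father's side)"
    return "IBC"

# Hypothetical genotypes: Mom gave you A, Dad gave you G, at all 10 locations.
mother = [("A", "C")] * 10
father = [("G", "T")] * 10
you = [("A", "G")] * 10

maternal_cousin = [("A", "A")] * 10
paternal_cousin = [("G", "G")] * 10
# John alternates between Mom's A and Dad's G, so he matches you but neither parent.
john = [("A", "A") if i % 2 == 0 else ("G", "G") for i in range(10)]

print(classify(maternal_cousin, you, mother, father))  # IBD (mother's side)
print(classify(paternal_cousin, you, mother, father))  # IBD (father's side)
print(classify(john, you, mother, father))             # IBC
```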

If you don’t have your parents to compare to, and you match multiple people on the same segment, there should be 2 groups of people who all match each other on that segment – one group from your Mom’s side and one from your Dad’s side – even if you can’t identify your common ancestor. If there are people who don’t fit into either of those two groups, because they don’t match those group members, then the misfits are identical by chance.

Even if your parents are unavailable, this is a situation where testing other relatives helps, and the closer the better, because those relatives will also fall into those match groups and will help identify which group is from which side of your family, and which ancestral line.

In the example below, using the same people from the phased parent example above, we no longer have our parents to compare to, but we do have an aunt, Mom’s sister, and an uncle, Dad’s brother. By comparing those who match us to our close relatives – if everyone in the match group matches each other, then we know they are IBD and they come from Mom’s side of the family or Dad’s side of the family.

Identical 6

Identical By Chance Identified Through Close Family Match Groups

In general matching, meaning not on specific segments, just on your match list, if John and I match, but John doesn’t match my mother’s sister, it could mean that John matches me on a different segment that my aunt didn’t inherit from my grandparents but that my mother did. So the match could be valid, even though he doesn’t match my aunt.

However, moving to the segment matching level, shown above, we can differentiate, at least for that segment.  This is yet another example of why segment analysis tools are so critically important.

If we only had one matching group, the green above, we would not be able to say that John was IBC on this segment, because John might be matching me on Dad’s side.

But in this case, we have proof points on both sides of this same segment, with two match groups, green from Mom and blue from Dad.  Mom’s side has a match group of 4+me (including her sister) who all match each other on this same segment, indicating that they all descend through my mother’s side of my tree.  On Dad’s side, we have his brother and two other people who match each other and me on those same segments.

Since John matches no one in either match group on either side, his match to me on this segment must be IBC.  You can read more about match groups and confidence here.
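Here is a rough sketch of that sorting process. It assumes you already know, for one segment, who matches you and which of those people also match each other there. Every name except John is made up, and grouping by connected components is my simplification: strict triangulation would require everyone in a group to match everyone else.

```python
from collections import defaultdict
from itertools import combinations

def match_groups(people, pairwise_matches):
    """Group people who match each other (directly or through others) on one segment."""
    neighbors = defaultdict(set)
    for a, b in pairwise_matches:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    groups = []
    for person in people:
        if person in seen:
            continue
        group = set()
        stack = [person]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical people who all match me on the same segment.
moms_side = ["Aunt", "Ann", "Bob", "Cathy"]
dads_side = ["Uncle", "Dave", "Eve"]
people = moms_side + dads_side + ["John"]

# Within each side, everyone matches everyone else on this segment.
pairwise = list(combinations(moms_side, 2)) + list(combinations(dads_side, 2))

groups = match_groups(people, pairwise)
loners = [g for g in groups if len(g) == 1]

print(groups)  # two family groups plus John on his own
print(loners)  # [{'John'}]: the identical by chance candidate
```

With only one family group present, a loner like John could still belong to the other side, so the code flags candidates rather than proving anything.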

Identical by chance segments tend to be smaller segments, because the chances of matching more locations in a row by chance diminish as the number of locations increases.
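A back-of-the-envelope sketch shows why. Suppose, purely as an assumption for illustration, that any single location has a 50% chance of sharing a letter by accident and that locations are independent of each other. Real DNA is not that tidy, but the trend is the same.

```python
def chance_run_probability(per_location_chance, run_length):
    """Probability that every location in a run matches by chance,
    assuming (unrealistically) that locations are independent."""
    return per_location_chance ** run_length

# Hypothetical 50% chance per location; the value drops fast as the run grows.
for run_length in (5, 10, 20):
    print(run_length, chance_run_probability(0.5, run_length))
```

Each added location multiplies the probability by the per-location chance, so long runs that match purely by accident become rare very quickly.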

Ok, so now you’ve got this – the two ways to match: identical by descent (IBD) and identical by chance (IBC), nature’s cruel joke.

So, what the heck are identical by state (IBS) and identical by population (IBP)?

Good questions.

Identical by State (IBS)

Identical by state is really an archaic term now, but you’ll likely still run into it from time to time. Understand that genetic genealogy is still a really new field of discovery.  Initially, terms weren’t defined very well and have since evolved.  IBD was used to mean a match where you could find a common ancestral line.  IBS, or identical by state, was often used when one could not find the ancestral line.  What this implied was that the match was not genealogical in nature.  But that often wasn’t true.  Just because we can’t determine who the common ancestor is, doesn’t mean that common ancestor doesn’t exist.  After we have more matches, we may well figure out the common ancestor at a later time.

What are some reasons we might not be able to figure out who our common ancestor is?

  • There’s an NPE or undocumented adoption in one line or the other.
  • The pedigree chart of one or both people doesn’t go back far enough in time.
  • The pedigree chart of one or both people is incorrect.
  • Not enough people have tested to connect the dots between the DNA. For example, we may share a common surname, Dodson, but be unable to actually pinpoint which Dodson line/ancestor we share.
  • The match is identical by population (IBP) and not in a genealogical timeframe. We see this most often in highly endogamous populations.
  • The match is identical by chance (IBC) and there is no common ancestor.

The tendency in the past has been to assume that if you can’t find the ancestor, then the problem MUST be that the match is Identical by State. But the problem is that identical by state includes two categories that are mutually exclusive: Identical by Chance and Identical by Population.

Identical by chance means there is no common ancestor, as we illustrated above.

Identical by Population means there IS a common ancestor, and you did receive your DNA from that ancestor, but you may not be able to figure out who it was because it’s too far back in time and many people from that same population base share that DNA segment.

So, today, we don’t say IBS anymore. We say IBD, and if it’s not IBD, then it’s either IBC or IBP, but not IBS. If someone says IBS, you need to ask whether they mean IBC or IBP, or whether they are trying to say something else, like “I can’t identify the common ancestor so it must be IBS.”

Identical by Population (IBP)

Identical by population means that a large portion of a population group shares a particular segment of DNA. Some people feel IBP segments are not useful and want all of these segments to be stripped away by population (or academic) based phasing software.

In some cases, if an individual is 100% Jewish, for example, they will have many IBP segments from within the highly endogamous Jewish population. They don’t have any other ancestral DNA segments from ancestors who aren’t Jewish to contrast against in their DNA, so their IBP segments are not useful to them, and are, in fact, just the opposite. There are too many IBP segments and they are in the way – often referred to as “noise” because they are not genealogically useful, even though they are descended from an ancestor (IBD). So, yes, IBP is a subset of IBD.

However, for someone who has the following genealogy, these same population based endogamous segments can be extremely useful and informative.

Identical 7

In this conceptual pedigree chart, the Jewish person married a non-Jewish person with deep colonial American ancestry. Their child “Colonial Jew” married someone who was mixed “Irish Asian.”  The person at the bottom, “me,” is not themselves endogamous but has several widely variant lines in their heritage including endogamous lines.

If I’m lucky enough to have an African population segment, that tells me very clearly which genealogical line that match is probably from. But if those IBP segments are removed, they can’t inform me in this situation.

Same with Jewish, or Asian, or Native American.

Let’s see how this might work in real matching.

Let’s say your mother’s A value is only found in African populations, and it’s found in very high proportions in African populations and much less frequently anyplace else in the world, except for where Africans settled.

Identical 8

Identical By Population Example Where Mother’s A Equals African

A few match outcomes are possible:

  1. You match with someone and you can discern a common ancestor or at least an ancestral line because you have only one African genealogical line – an ancestor in your mother’s line, like in the pedigree chart above.
  2. You match with someone and you cannot discern a common ancestor because many or all of your lines are African, similar to the Jewish example.
  3. You match with someone and you identify a common ancestor, but later a second genealogical line matches on that same segment because the segment is so common in the African population. This means you could have received that actual DNA segment from either ancestral line.
  4. Some DNA testing company runs academic or population based phasing software against your DNA and removes that segment entirely because they’ve decided that it occurs too frequently in a population to be useful. In this case, you won’t match that person at all.
  5. Some DNA testing company runs academic or population based phasing software against your DNA and removes that segment entirely because they’ve decided that particular segment in your results is “too matchy” so it must therefore be “invalid” and population based. This is often referred to as a “pile-up” and means that you have proportionally more matches on that segment than you do on other segments. If your “pile-up” segments are removed in this case, again, you won’t match at all. This is exactly what happened to my Acadian matches when Ancestry implemented their Timber phasing software, which removes pile-ups.

The graph below was provided to me at Ancestry DNA Day as an example of my own “pile-up” areas in my genome.

genome pileups

Ancestry, with their Timber routine, uses population phasing and removes the areas of your DNA they deem “too matchy.” This helps Jewish and other heavily endogamous people by removing truly population based matches that are spurious and where the contributing ancestor is impossible to discern. An endogamous individual could achieve much of the same effect by utilizing a higher matching threshold for their own matches, although that’s not an option at Ancestry.

However, for those of us who are not entirely endogamous, but who may have endogamous lines or lines from different parts of the world, population based phasing removes valuable informational segments and therefore, prevents valuable matches. When Ancestry ran Timber against my results, I lost all but one of my Acadian matches.  Yes, Acadians are heavily endogamous, but in my case, that line accounts for 1 of my 16 great-great-grandparents.  Believe me, if I had a tool to put all of my autosomal matches in one of 16 buckets, I would think it was a wonderful day!!!

16 gggrandparents

Because of endogamy, I actually carried MORE Acadian DNA than I would otherwise carry from a non-endogamous population – so yes, I am very matchy to my Acadian cousins, especially on smaller segments – or I was until Ancestry stripped all of that away. Thankfully, I still have all of my matches at Family Tree DNA.

Why is endogamous DNA more matchy? Because endogamous populations only have the founders’ DNA and they just keep passing the same founder DNA around and around.

Ironically, another name for this kind of phasing is “excess IBD” phasing. This means that “someone” decides unilaterally how much matching one “should” have and just chops the rest off at that threshold. Clearly, that threshold for a fully Jewish person and me would be very different – and one size absolutely does NOT fit all.

I want to show you one more example of what population based phasing does. It chops the heart out of segments that would otherwise match.

People whose parents also test should match their parents on exactly 22 segments, one for each chromosome – because each child is a 100% match to their parents. If there is a read error or two (or three), then let’s say they could have as many as 25 matches, because some chromosomes are chopped in two because of a technical issue.  It occasionally happens.

At Ancestry, we’re seeing 80 to 120 matching segments for each parent/child pair, which means Timber is removing 58 to roughly 100 legitimate segments that you received from your parent. One individual reported that they match one parent on 150 different segments, meaning that Ancestry removed 128 segments they decided are “too matchy” but are very clearly ancestral, or IBD, because all of your DNA must match your parent’s DNA on the strand they gave you. However, because of Timber’s removal of “too matchy” segments, the person no longer matches their parent on that removed segment – or on any of those 58 to 128 removed segments. And remember, there is only one way to receive your DNA, so all of your DNA must match that of your parents. You have no invalid matches to your parents’ DNA. You can read more here.

Here’s a visual of what IBP phased matching does to you. Recall in our example that you need 10 contiguous matching locations to be considered a match.  I’m showing 20 locations in this example.

Identical 9

Normal Matching – No Population or Academic Phasing

In this first example, the DNA you inherited from your mother is a combination of T and A, where A=African. Notice that only part of what you inherited from your mother is the A this time.

In normal matching without IBP phasing, above, the matching threshold is still 10, but you match your match on a segment that totals 20 locations or units. Now it’s up to you to see if you can identify your common ancestor.

In the IBP phased example, below, your African DNA is removed as a result of population based phasing software. Your African DNA used to be where the red spot with no values is showing in the You 1 column.  Therefore, you still match on the Ts, but you only have a contiguous run of 7 Ts, then the 7 As phasing deleted, then 6 more matching Ts.  The problem is, of course, that instead of a nice matching segment of 20 units, above, you now have no match at all because you don’t have 10 matching locations in a row.  Of course, the same IBP phasing would apply to your mother, so your match would not match your mother either, which means that a valid parentally phased match is not reported.

Identical 10

Population Based Phased Matching Example Removing African

What’s worse, you’ll never have that opportunity to see if you can find your common ancestor, because you and your match will never be reported as a match. This is a lost opportunity.  In the first “normal matching” example, you may never BE able to find that common ancestor, but you have the opportunity to try.  In the second IBP phased matching example, you certainly won’t ever find your common ancestor because you’re not shown as a match.  When population based or academic phasing is involved, you’ll never know what you are missing.
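The two figures above can be reproduced with a short sketch. It assumes the same toy setup: 20 locations, a threshold of 10 in a row, and a maternal strand of 7 Ts, then 7 As, then 6 Ts, where A stands in for the African value. Treating population phasing as simply blanking out every A is my simplification for illustration, not a description of how any vendor's software actually works.

```python
THRESHOLD = 10  # the article's teaching threshold, not a vendor value

def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

# Strand you inherited from your mother: 7 Ts, 7 As (A = African), 6 Ts.
your_strand = list("T" * 7 + "A" * 7 + "T" * 6)
match_strand = list(your_strand)  # your match shares the whole segment

# Normal matching: compare every location.
shared = [a == b for a, b in zip(your_strand, match_strand)]

# Hypothetical population phasing: locations carrying A are removed first.
phased = [same and letter != "A" for same, letter in zip(shared, your_strand)]

print(longest_run(shared), longest_run(shared) >= THRESHOLD)  # 20 True
print(longest_run(phased), longest_run(phased) >= THRESHOLD)  # 7 False
```

The same 20 shared locations go from a comfortable match to no match at all, simply because the middle was cut out.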

This chopping phenomenon is not a rare occurrence with population based phasing. In fact, if you divide 100 removed segments by 22 chromosomes, there are roughly 4 to 5 artificial “chops” taken out of every one of your 22 chromosomes with each parent at Ancestry, and in some cases, more. The person who now matches their parent on 150 segments has an average of 5.8 artificial phasing induced chops in each chromosome. When Ancestry implemented Timber, many people lost between 80% and 90% of their total matches. Mine went from 13,100 to 3,350, a loss of about 75%. At least some of those were valid matches for which we had identified common ancestral lines.
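If you want to check the arithmetic yourself, here is a small sketch using only the numbers quoted in this article. The one assumption baked in is that an unphased parent/child comparison yields exactly 22 segments, one per chromosome, so every segment beyond 22 counts as one removal.

```python
CHROMOSOMES = 22  # one expected parent/child segment per chromosome

def removed_segments(reported_segments):
    """Extra segments created by chopping whole-chromosome matches apart."""
    return reported_segments - CHROMOSOMES

def chops_per_chromosome(reported_segments):
    """Average number of removed pieces per chromosome."""
    return removed_segments(reported_segments) / CHROMOSOMES

def percent_lost(before, after):
    """Percentage of matches lost between two counts."""
    return 100 * (before - after) / before

for reported in (80, 120, 150):
    print(reported, removed_segments(reported),
          round(chops_per_chromosome(reported), 1))
# 80 58 2.6
# 120 98 4.5
# 150 128 5.8

print(round(percent_lost(13100, 3350), 1))  # 74.4
```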

So, identical by population (IBP) doesn’t necessarily mean bad, unless you’re entirely endogamous. If you’re entirely endogamous, then IBP means challenging and can generally be overcome by looking at larger matching segments, which are less likely to be either IBP or IBC.

Identical by population can be very useful in someone not entirely endogamous in that it preserves ancestral DNA in a given population. In people who carry a combination of different endogamous lines, such as Jewish and Acadian, this phenomenon can actually be very useful, because it increases your chances of matching other individuals from that ancestral line – and being able to assign them appropriately.

Identical by What?

So, in summary, you are either identical because you received DNA from a common ancestor (IBD) or identical by chance (IBC) because nature is playing a mean joke on you and you match, literally, by chance because your match’s DNA is zigzagging back and forth between your parents’ DNA.  And by the way, you can match someone IBD on one segment and the same person IBC or IBP on others.

If you match someone but that person does not also match either of your parents, then it’s an IBC, identical by chance, match. Measuring a match against both yourself and your parents to determine if the match is IBC or IBD is called parental phasing.  We will have a Concepts article shortly about Parental Phasing, so stay tuned.

If you don’t have parents to match against, your matches on any segment should cleanly cluster into two matching groups where you match them and your matches also match each other on that same segment. One group for your mother’s side and one group for your father’s side.  Those who match you but don’t fall into one group or the other are identical by chance, like John in our example.  Of course, you won’t be able to sort these out until you have several matches on that segment.  This is also why testing all available upstream family members is so useful.

If you’re not IBC, you’re IBD meaning that you and your match received that DNA segment from a common ancestor, whether or not you can identify that ancestor.

Identical by population (IBP) is a type or subset of identical by descent (IBD) where many people from that same population group carry the same DNA segment. This is seen in its most pronounced fashion in heavily endogamous populations such as Ashkenazi Jews.

If you are from a highly endogamous population, you will have many IBP matches, generally on smaller segments that have been chopped up over time, and you will want to use a higher matching threshold, perhaps up to 10cM, for genealogical matching, or higher.

If you have endogamous lines in your tree, but are not entirely endogamous, IBP segments may actually be beneficial because you may be able to attribute matches to a specific line, even if not the specific ancestor in that line.

The smaller the segment, the more likely it is to be less useful to you, whether IBD or IBP – but that isn’t to say all small segments should be disregarded because they are assumed to be either IBC or not useful. That’s not the case.  Some are IBD and all IBD segments have the potential to be very useful.  Kitty Cooper just recently reported another wonderful success story using a 6cM triangulated segment.

If you’re highly endogamous, or looking only for the low hanging fruit, which is more likely to be immediately rewarding, then work with only larger segment matches. They are less likely to be IBC or IBP and more likely to yield results more quickly. I always begin with the largest matching segments, because not only are they easier to assign to an ancestor, but those matching people may also have smaller matching segments that I can tentatively (pending triangulation) attribute to that specific ancestor as well.
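As a small sketch of that workflow, the snippet below keeps only segments at or above a chosen threshold and lists the largest first. The match names and centimorgan values are invented, and the 10 cM default simply echoes the figure suggested above for endogamous testers. Pick whatever threshold suits your own situation.

```python
def prioritize(segments, min_cm=10.0):
    """Keep segments at or above the threshold, largest first."""
    kept = [segment for segment in segments if segment["cm"] >= min_cm]
    return sorted(kept, key=lambda segment: segment["cm"], reverse=True)

# Hypothetical matching segments.
segments = [
    {"match": "Cousin 1", "cm": 42.5},
    {"match": "Cousin 2", "cm": 6.0},
    {"match": "Cousin 3", "cm": 15.2},
    {"match": "Cousin 4", "cm": 9.9},
]

for segment in prioritize(segments):
    print(segment["match"], segment["cm"])
# Cousin 1 42.5
# Cousin 3 15.2
```

The smaller segments are set aside, not thrown away. You can come back to them once the larger ones have been assigned to an ancestor.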

Here’s a handy-dandy cheat sheet if you’re having trouble remembering “Identical by What.”

Identical by Chart

Understand that working with genetic genealogy and autosomal DNA is much like panning for gold. You may get lucky and find a large nugget or two smiling at you from on top of the pile, but the majority of your rewards will be as a result of hard work sifting and panning and accumulating those small golden flakes that aren’t immediately obvious and useful. Cumulatively, they may well hold your family secrets and the keys to locks long ago frozen shut.

Here’s hoping all your matches are IBD!!!!!
