Hit a Genetic Genealogy Home Run Using Your Double-Sided Two-Faced Chromosomes While Avoiding Imposters

Do you want to hit a home run with your DNA test, but find yourself a mite bewildered?

Yep, those matches can be somewhat confusing – especially if you don’t understand what’s going on. Do you have a nagging feeling that you might be missing something?

I’m going to explain chromosome matching, and its big sister, triangulation, step by step to remove any confusion, to help you sort through your matches and avoid imposters.

This article is one of the most challenging I’ve ever written – in part because it’s a concept that I’m so familiar with but can be, and is, misinterpreted so easily. I see mistakes and confusion daily, which means that resulting conclusions stand a good chance of being wrong.

I’ve tried to simplify these concepts by giving you easy-to-use memory tools.

There are three key phrases to remember, as memory-joggers when you work through your matches using a chromosome browser: double-sided, two faces and imposter. While these are “cute,” they are also quite useful.

When you’re having a confusing moment, think back to these memory-jogging key words and walk yourself through your matches using these steps.

These three concepts are the foundation of understanding your matches, accurately, as they pertain to your genealogy. Please feel free to share, link or forward this article to your friends and especially your family members (including distant cousins) who work with genetic genealogy. 

Now, it’s time to enjoy your double-sided, two-faced chromosomes and avoid those imposters :)

Are you ready? Grab a nice cup of coffee or tea and learn how to hit home runs!

Double-Sided – Yes, Really

Your chromosomes really are double sided, and two-faced too – and that’s a good thing!

However, it’s initially confusing because when we view our matches in a chromosome browser, it looks like we only have one “bar” or chromosome, and our matches from both our maternal and paternal sides are shown on that one single bar.

How can this be? We all have two copies of chromosome 1, one from each parent.

Chromosome 1 match.png

This is my chromosome 1, with my match showing in blue when compared to my chromosome, in gray, as the background.

However, I don’t know if this blue person matches me on my mother’s or father’s chromosome 1, both of which I inherited. It could be either. Or neither – meaning the dreaded imposter – especially that small blue piece at left.

What you’re seeing above is in essence both “sides” of my chromosome number 1, blended together, in one bar. That’s what I mean by double-sided.

There’s no way to tell which side or match is maternal and which is paternal without additional information – and misunderstanding leads to misinterpreting results.

Let’s straighten this out and talk about what matches do and don’t mean – and why they can be perplexing. Oh, and how to discover those imposters!

Your Three Matches

Let’s say you have three matches.

At Family Tree DNA, the example chromosome browser I’m using, or at any vendor with a chromosome browser, you select your matches which are viewed against your chromosomes. Your chromosomes are always the background, meaning in this case, the grey background.

Chromosome 1-4.png

  • This is NOT three copies each of your chromosomes 1, 2, 3 and 4.
  • This is NOT displaying your maternal and paternal copies of each chromosome pictured.
  • We CANNOT tell anything from this image alone relative to maternal and paternal side matches.
  • This IS showing three individual people matching you on your chromosome 1 and the same three people matching you in the same order on every chromosome in the picture.

Let’s look at what this means and why we want to utilize a chromosome browser.

I selected three matches that I know are not all related through the same parent so I can demonstrate how confusing matches can be sorted out. Throughout this article, I’ve tried to explain each concept in at least two ways.

Please note that I’m using only chromosomes 1-4 as examples, not because they are any more, or less, important than the other chromosomes, but because showing all 22 would not add any benefit to the discussion. The X chromosome has a separate inheritance path and I wrote about that here.

Let’s start with a basic question.

Why Would I Want to Use a Chromosome Browser?

Genealogists view matches on chromosome browsers because:

  • We want to see where our matches match us on our chromosomes
  • We’d like to identify our common ancestor with our match
  • We want to assign a matching segment to a specific ancestor or ancestral line, which confirms those ancestors as ours
  • When multiple people match us on the same location on the chromosome browser, that’s a hint telling us that we need to scrutinize those matches more closely to determine if those people match us on our maternal or paternal side which is the first step in assigning that segment to an ancestor

Once we accurately assign a segment to an ancestor, when anyone else matches us (and those other people) on that same segment, we know which ancestral line they match through – which is a great head start in terms of identifying our common ancestor with our new match.

That’s a genetic genealogy home run!

Home Runs 

There are four bases in a genetic genealogy home run.

  1. Determine whether you actually match someone on a given segment
  2. Determine whether you match a group of people on that same segment
  3. Determine that you all descend from a common ancestor
  4. The home run: determine which ancestor you have in common and assign that segment to that ancestor

If you can’t see segment information, you can’t use a chromosome browser and you can’t confirm the match on that segment, nor can you assign that segment to a particular ancestor, or ancestral couple.

The entire purpose of genealogy is to identify and confirm ancestors. Genetic genealogy confirms the paper trail and breaks down even more brick walls.

But before you can do that, you have to understand what matches mean and how to use them.

The first step is to understand that our chromosomes are double-sided and you can’t see both of your chromosomes at once!

Double Sided – You Can’t See Both of Your Chromosomes at Once

The confusing part of the chromosome browser is that it can only “see” your two chromosomes blended as one. They are both there, but you just can’t see them separately.

Here’s the important concept:

You have 2 copies of chromosomes 1 through 22 – one copy that you received from your mother and one from your father, but you can’t “see” them separately.

When your DNA is sequenced, your DNA from your parents’ chromosomes emerges as if it has been through a blender. Your mother’s chromosome 1 and your father’s chromosome 1 are blended together. That means that without additional information, the vendor can’t tell which matches are from your father’s side and which are from your mother’s side – and neither can you.

All the vendor can tell is that someone matches you on the blended version of your parents. This isn’t a negative reflection on the vendors, it’s just how the science works.

Chromosome 1.png

Applying this to chromosome 1, above, means that each segment from each person, the blue person, the red person and the teal person might match you on either one of your chromosomes – the paternal chromosome or the maternal chromosome – but because the DNA of your mother and father are blended – there’s no way without additional information to sort your chromosome 1 into a maternal and paternal “side.”

Hence, you’re viewing “one” copy of your combined chromosomes above, but it’s actually “two-sided” with both maternal and paternal matches displayed in the chromosome browser.

Parent-Child Matches

Let’s explain this another way.

Chromosome parent.png

The example above shows one of my parents matching me. Don’t be deceived by the color blue which is selected randomly. It could be either parent. We don’t know.

You can see that I match my parent on the entire length of chromosome 1, but there is no way for me to tell if I’m looking at my mother’s match or my father’s match, because both of my parents (and my children) will match me on exactly the same locations (all of them) on my chromosome 1.

Chromosome parent child.png

In fact, here is a combination of my children and my parents matching me on my chromosome 1.

To sort out who is matching on paternal and maternal chromosomes, or the double sides, I need more information. Let’s look at how inheritance works.

Stay with me!

Inheritance Example

Let’s take a look at how inheritance works visually, using an example segment on chromosome 1.

Chromosome inheritance.png

In the example above:

  • The first column shows addresses 1-10 on chromosome 1. In this illustration, we are only looking at positions, chromosome locations or addresses 1-10, but real chromosomes have tens of thousands of addresses. Think of your chromosome as a street with the same house numbers on both sides. One side is Mom’s and one side is Dad’s, but you can’t tell which is which by looking at the house numbers because the house numbers are identical on both sides of the street.
  • The DNA pieces, or nucleotides (T, A, C or G), that you received from your Mom are shown in the column labeled Mom #1, meaning we’re looking at your mother’s pink chromosome #1 at addresses 1-10. In our example she has all As that live on her side of the street at addresses 1-10.
  • The DNA pieces that you received from your Dad are shown in the blue column and are all Cs living on his side of the street in locations 1-10.

In other words, the values that live in the Mom and Dad locations on your chromosome streets are different. Two different faces.

However, all that the laboratory equipment can see is that there are two values at address 1, A and C, in no particular order. The lab can’t tell which nucleotide came from which parent or which side of the street they live on.

The DNA sequencer knows that it found two values at each address, meaning that there are two DNA strands, but the output is jumbled, as shown in the First and Second read columns. The machine knows that you have an A and C at the first address, and a C and A at the second address, but it can’t put the sequence of all As together and the sequence of all Cs together. What the sequencer sees is entirely unordered.

This happens because your maternal and paternal DNA is mixed together during the extraction process.
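If it helps to see the blending written out, here is a minimal Python sketch of the street example. It is a toy model, not how any lab's software works: the names `mom_side`, `dad_side` and `unphased_reads` are made up for illustration, and each address holds a single letter.

```python
# Toy model of the street analogy: 10 addresses on chromosome 1.
mom_side = "AAAAAAAAAA"   # values inherited from Mom (illustrative)
dad_side = "CCCCCCCCCC"   # values inherited from Dad (illustrative)

def unphased_reads(side_one, side_two):
    """Return what the lab reports: an unordered pair of values per address."""
    return [frozenset((x, y)) for x, y in zip(side_one, side_two)]

reads = unphased_reads(mom_side, dad_side)
swapped = unphased_reads(dad_side, mom_side)

# The same result comes back no matter which parent is listed first,
# so the reads alone cannot say which value lives on which side.
print(reads == swapped)      # True
print(sorted(reads[0]))      # ['A', 'C']
```

Using an unordered pair for each address is the whole point: once the order is gone, nothing in the data says "this letter is Mom's."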

Chromosome actual


Looking at the portion of chromosome 1 where the blue and teal people both match you – your actual blended values are shown overlaid on that segment, above. We don’t know why the blue and the teal people are matching you. They could be matching because they have all As (maternal), all Cs (paternal) or some combination of As and Cs (a false positive match that is identical by chance).

There are only two ways to reassemble your nucleotides (T, A, C, and G) in order and then to identify the sides as maternal and paternal – phasing and matching.

As you read this next section, it does NOT mean that you must have a parent for a chromosome browser to be useful – but it does mean you need to understand these concepts.

There are two types of phasing.

Parental Phasing

  • Parental Phasing is when your DNA is compared against that of one or both parents and sorted based on that comparison.

Chromosome inheritance actual.png

Parental phasing requires that at least one parent’s DNA is available, has been sequenced and is available for matching.

In our example, Dad’s first 10 locations (that you inherited) on chromosome 1 are shown, at left, with your two values shown as the first and second reads. One of your read values came from your father and the other one came from your mother. In this case, the Cs came from your father. (I’m using A and C as examples, but the values could just as easily be T or G or any combination.)

When parental phasing occurs, the DNA of one of your parents is compared to yours. In this case, your Dad gave you a C in locations 1-10.

Now, the vendor can look at your DNA and assign your DNA to one parent or the other. There can be some complicating factors, like if both your parents have the same nucleotides, but let’s keep our example simple.

In our example above, you can see that I’ve colored portions of the first and second strands blue to represent that the C value at that address can be assigned through parental phasing to your father.

Conversely, because your mother’s DNA is NOT available in our example, we can’t compare your DNA to hers, but all is not lost. Because we know which nucleotides came from your father, the remaining nucleotides had to come from your mother. Hence, the As remain after the Cs are assigned to your father and belong to your mother. These remaining nucleotides can logically be recombined into your mother’s DNA – because we’ve subtracted Dad’s DNA.

I’ve reassembled Mom, in pink, at right.
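The subtraction step can be sketched in a few lines of Python. This is a simplified illustration only: the function name `phase_with_one_parent` is hypothetical, and it assumes we already know which value Dad passed on at each address, as in the picture. Real phasing has to deal with the complicating factors mentioned above.

```python
def phase_with_one_parent(my_reads, parent_side):
    """Subtract the known parent's value at each address; what is left
    is attributed to the other parent."""
    other_side = []
    for pair, parent_value in zip(my_reads, parent_side):
        if parent_value not in pair:
            raise ValueError("parent value not found in child's reads")
        remaining = [v for v in pair if v != parent_value]
        # If both reads are the same letter, the other parent gave that letter too.
        other_side.append(remaining[0] if remaining else parent_value)
    return "".join(other_side)

my_reads = [("A", "C"), ("C", "A")] * 5   # jumbled reads at addresses 1-10
dad_side = "C" * 10                        # what Dad passed on (illustrative)
mom_side = phase_with_one_parent(my_reads, dad_side)
print(mom_side)  # AAAAAAAAAA
```

Note that Mom's side is rebuilt without ever looking at Mom's DNA, which is exactly why testing even one parent is so valuable.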

Statistical/Academic Phasing

  • A second type of phasing uses something referred to as statistical or academic phasing.

Statistical phasing is less successful because it uses statistical calculations based on reference populations. In other words, it uses a “most likely” scenario.

By studying reference populations, we know scientifically that, generally, for our example addresses 1-10, we either see all As or all Cs grouped together.

Based on this knowledge, the Cs can then logically be grouped together on one “side” and As grouped together on the other “side,” but we still have no way to know which side is maternal or paternal for you. We only know that normally, in a specific population, we see all As or all Cs. After assigning strings or groups of nucleotides together, the algorithm then attempts to see which groups are found together, thereby assigning genetic “sides.” Assigning the wrong groups to the wrong side sometimes happens using statistical phasing and is called strand swap.

Once the DNA is assigned to physical “sides” without a parent or matching, we still can’t identify which side is paternal and which is maternal for you.

Statistical or academic phasing isn’t always accurate, in part because of the differences found in various reference populations and resulting admixture. Sometimes segments don’t match well with any population. As more people test and more reference populations become available, statistical/academic phasing improves. 23andMe uses academic phasing for ethnicity, resulting in a strand swap error for me. Ancestry uses academic phasing before matching.

By comparison to statistical or academic phasing, parental phasing with either or both parents is highly accurate which is why we test our parents and grandparents whenever possible. Even if the vendor doesn’t use our parents’ results, we certainly can!

If someone matches you and your parent too, you know that match is from that parent’s side of your tree.

Matching

The second methodology to sort your DNA into maternal and paternal sides is matching, either with or without your parents.

Matching to multiple known relatives on specific segments assigns those segments of your DNA to the common ancestor of those individuals.

In other words, when I match my first cousin, and our genealogy indicates that we share grandparents – assuming we match on the appropriate amount of DNA for the expected relationship – that match goes a long way to confirming our common ancestor(s).

The closer the relationship, the more comfortable we can be with the confirmation. For example, if you match someone at a parental level, they must be either your biological mother, father or child.

While parent, sibling and close relationships are relatively obvious, more distant relationships are not and can occur through unknown or multiple ancestors. In those cases, we need multiple matches through different children of that ancestor to reasonably confirm ancestral descent.

Ok, but how do we do that? Let’s start with some basics that can be confusing.

What are we really seeing when we look at a chromosome browser?

The Grey/Opaque Background is Your Chromosome

It’s important to realize that you will see as many images of your chromosome(s) as people you have selected to match against.

This means that if you’ve selected 3 people to match against your chromosomes, then you’ll see three images of your chromosome 1, three images of your chromosome 2, three images of your chromosome 3, three images of your chromosome 4, and so forth.

Remember, chromosomes are double-sided, so you don’t know whether these are maternal or paternal matches (or imposters.)

In the illustration below, I’ve selected three people to match against my chromosomes in the chromosome browser. One person is shown as a blue match, one as a red match, and one as a teal match. Where these three people match me on each chromosome is shown by the colored segments on the three separate images.

Chromosome 1.png

My chromosome 1 is shown above. These images are simply three people matching to my chromosome 1, stacked on top of each other, like cordwood.

The first image is for the blue person. The second image is for the red person. The third image is for the teal person.

If I selected another person, they would be assigned a different color (by the system) and a fourth stacked image would occur.

These stacked images of your chromosomes are NOT inherently maternal or paternal.

In other words, the blue person could match me maternally and the red person paternally, or any combination of maternal and paternal. Colors are not relevant – in other words colors are system assigned randomly.

Notice that portions of the blue and teal matches overlap at some of the same locations/addresses, which is immediately visible when using a chromosome browser. These areas of common matching are of particular interest.

Let’s look closer at how chromosome browser matching works.

What about those colorful bars?

Chromosome Browser Matching

When you look at your chromosome browser matches, you may see colored bars on several chromosomes. In the display for each chromosome, the same color will always be shown in the same order. Most people, unless very close relatives, won’t match you on every chromosome.

Below, we’re looking at three individuals matching on my chromosomes 1, 2, 3 and 4.

Chromosome browser.png

The blue person will be shown in location A on every chromosome at the top. You can see that the blue person does not match me on chromosome 2 but does match me on chromosomes 1, 3 and 4.

The red person will always be shown in the second position, B, on each chromosome. The red person does not match me on chromosomes 2 or 4.

The teal person will always be shown in position C on each chromosome. The teal person matches me on at least a small segment of chromosomes 1-4.

When you close the browser and select different people to match, the colors will change and the stacking order perhaps, but each person selected will always be consistently displayed in the same position on all of your chromosomes each time you view.

The Same Address – Stacked Matches

In the example above, we can see that several locations show stacked segments in the same location on the browser.

Chromosome browser locations.png

This means that on chromosome 1, the blue and teal people both match me on at least part of the same addresses – the areas that overlap fully. Remember, we don’t know if that means the maternal side or the paternal side of the street. Each match could match on the same or different sides.

Said another way, blue could be maternal and teal could be paternal (or vice versa,) or both could be maternal or paternal. One or the other or both could be imposters, although with large segments that’s very unlikely.

On chromosome 4, blue and teal both match me on two common locations, but the teal person extends beyond the length of the matching blue segments.

Chromosome 3 is different because all three people match me at the same address. Even though the red and teal matching segments are longer, the shared portion of the segment between all three people, the length of the blue segment, is significant.

The fact that the stacked matches are in the same places on the chromosomes, directly above/below each other, DOES NOT mean the matches also match each other.
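Spotting stacked matches is simple arithmetic on start and end addresses, which a short Python sketch can show. The segment positions here are hypothetical numbers I made up, not values read from my actual matches, and `overlap` is an illustrative helper rather than any vendor's function.

```python
def overlap(segment_a, segment_b):
    """Return the shared (start, end) range of two segments, or None."""
    start = max(segment_a[0], segment_b[0])
    end = min(segment_a[1], segment_b[1])
    return (start, end) if start < end else None

# Hypothetical start/end addresses for three matches on one chromosome.
blue = (30, 60)
teal = (45, 90)
red = (100, 120)

print(overlap(blue, teal))  # (45, 60)
print(overlap(blue, red))   # None
```

Notice what this calculation does not use: the values living at those addresses. An overlap only says two people match me at the same place, not that they match each other.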

The only way to know whether these matches are both on one side of my tree is whether or not they match each other. Do they look the same or different? One face or two? We can’t tell from this view alone.

We need to evaluate!

Two Faces – Matching Can be Deceptive!

What do these matches mean? Let’s ask and answer a few questions.

  • Does a stacked match mean that one of these people match on my mother’s side and one on my father’s side?

They might, but stacked matches don’t MEAN that.

If one match is maternal, and one is paternal, they still appear at the same location on your chromosome browser because Mom and Dad each have a side of the street, meaning a chromosome that you inherited.

Remember in our example that even though they have the same street address, Dad has blue Cs and Mom has pink As living at that location. In other words, their faces look different. So unless Mom and Dad have the same DNA on that entire segment of addresses, 1-10, Mom and Dad won’t match each other.

Therefore, my maternal and paternal matches won’t match each other on that segment either, unless:

  1. They are related to me through both of my parents and on that specific location.
  2. My mother and father are related to each other and their DNA is the same on that segment.
  3. There is significant endogamy that causes my parents to share DNA segments from their more distant ancestors, even though they are not related in the past few generations.
  4. The segments are small (segments less than 7 cM are false matches roughly 50% of the time) and therefore the match is simply identical by chance. I wrote about that here. The chart showing valid cM match percentages is shown here, but to summarize, 7-8 cM matches are valid roughly 46% of the time, 8-9 cM roughly 66%, 9-10 cM roughly 91% and 10-11 cM roughly 95%, but 100% is not reached until about 20 cM, and I have seen a few exceptions above that, especially when imputation is involved.

Chromosome inheritance match.png

In this inheritance example, we see that pink Match #1 is from Mom’s side and matches the DNA I inherited from pink Mom. Blue Match #2 is from Dad’s side and matches the DNA I inherited from blue Dad. But as you can see, Match #1 and Match #2 do not match each other.

Therefore, the address is only half the story (double-sided.)

What lives at the address is the other half. Mom and Dad have two separate faces!
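Here is the same idea as a minimal Python sketch. It is a deliberate simplification: each match is reduced to the single string of values they carry on this segment, as in the illustration, and the helper names `matches_me` and `match_each_other` are invented for this example.

```python
my_reads = [{"A", "C"}] * 10     # my blended values at addresses 1-10

def matches_me(match_values, reads):
    """A match lines up with me if, at every address, their value is one of my two."""
    return all(v in pair for v, pair in zip(match_values, reads))

def match_each_other(values_one, values_two):
    """Two of my matches match each other only if the same values live at each address."""
    return values_one == values_two

match_1 = "A" * 10   # from Mom's side
match_2 = "C" * 10   # from Dad's side

print(matches_me(match_1, my_reads))        # True
print(matches_me(match_2, my_reads))        # True
print(match_each_other(match_1, match_2))   # False
```

Both people pass the test against my blended reads, yet they fail against each other. Same address, two different faces.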

Chromosome actual overlay


Looking at our example of what our DNA in parental order really looks like on chromosome 1, we see that the blue person actually matches on my maternal side with all As, and the teal person on the paternal side with all Cs.

  • Does a stacked match on the chromosome browser mean that two people match each other?

Sometimes it happens, but not necessarily, as shown in our example above. The blue and teal person would not match each other. Remember, the addresses are only half the story (the street is double-sided), but the nucleotides that live at that address tell the real story. Think two different looking faces, Mom’s and Dad’s, peering out those windows.

If stacked matches match each other too – then they match me on the same parental side. If they don’t match each other, don’t be deceived just because they live at the same address. Remember – Mom’s and Dad’s two faces look different.

For example, if both the blue and teal person match me maternally, with all As, they would also match each other. The addresses match and the values that live at the address match too. They look exactly the same – so they both match me on either my maternal or paternal side – but it’s up to me to figure out which is which using genealogy.

Chromosome actual maternal.png


When my matches do match each other on this segment, plus match me of course, it’s called triangulation.

Triangulation – Think of 3

If my two matches match each other on this segment, in addition to me, it’s called triangulation which is genealogically significant, assuming:

  1. That the triangulated people are not closely related. Triangulation with two siblings, for example, isn’t terribly significant because the common ancestor is only their parents. Same situation with a child and a parent.
  2. The triangulated segments are not small. Triangulation, like matching, on small segments can happen by chance.
  3. Enough people triangulate on the same segment that descends from a common ancestor to confirm the validity of the common ancestor’s identity, also confirming that the match is identical by descent, not identical by chance.

Chromosome inheritance triangulation.png

The key to determining whether my two matches both match me on my maternal side (above) or paternal side is whether they also match each other.

If so, assuming all three of the conditions above are true, we triangulate.
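As a rough Python sketch, a triangulation check on one segment might look like the following. The function `triangulates` and its `min_cm` cutoff of 7 cM are assumptions for illustration, and the sketch only covers the "match each other" test and the segment-size condition. It does not check how closely the people are related or how many people triangulate.

```python
def triangulates(my_reads, match_one, match_two, segment_cm, min_cm=7.0):
    """Return True when two matches and I all match on one segment.

    min_cm is an assumed cutoff for "not small"; pick your own threshold.
    """
    if segment_cm < min_cm:
        return False  # too small to trust, may be identical by chance
    both_match_me = all(
        a in pair and b in pair
        for a, b, pair in zip(match_one, match_two, my_reads)
    )
    match_each_other = match_one == match_two
    return both_match_me and match_each_other

my_reads = [{"A", "C"}] * 10   # my blended values at addresses 1-10
blue = "A" * 10
teal_same_side = "A" * 10
teal_other_side = "C" * 10

print(triangulates(my_reads, blue, teal_same_side, segment_cm=15.0))   # True
print(triangulates(my_reads, blue, teal_other_side, segment_cm=15.0))  # False
print(triangulates(my_reads, blue, teal_same_side, segment_cm=5.0))    # False
```

Think of 3: me to the first match, me to the second match, and the two matches to each other. All three legs have to hold.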

Next, let’s look at a three-person match on the same segment and how to determine if they triangulate.

Three Way Matching and Identifying Imposters

Chromosome 3 in our example is slightly different, because all three people match me on at least a portion of that segment, meaning at the same address. The red and teal segments line up directly under the blue segment – so the portion that I can potentially match identically to all 3 people is the length of the blue segment. It’s easy to get excited, but don’t get excited quite yet.

Chromosome 3 way match.png

Given that three people match me on the same street address/location, one of the following three situations must be true:

  • Situation 1 – All three people match each other in addition to me, on that same segment, which means that all three of them match me on either the maternal or paternal side. This confirms that we are related on the same side, but not how or which side.

Chromosome paternal.png

In order to determine which side, maternal or paternal, I need to look at their and my genealogy. The blue arrows in these examples mean that I’ve determined these matches to all be on my father’s side utilizing a combination of genealogy plus DNA matching. If your parent is alive, this part is easy. If not, you’ll need to utilize common matching and/or triangulation with known relatives.

  • Situation 2 – Of these three people, Cheryl, the blue bar on top, matches me but does not match the other two. Charlene and David, the red and teal, match each other, plus me, but not Cheryl.

Chromosome maternal paternal.png

This means that at least one side, either maternal or paternal, is represented, given that Charlene and David also match each other. Until I can look at the identity of who matches, or their genealogy, I can’t tell which person or people descend from which side.

In this case, I’ve determined that Cheryl, my first cousin, with the pink arrow matches me on Mom’s side and Charlene and David, with the blue arrows, match me on Dad’s side. So both my maternal and paternal sides are represented – my maternal side with the pink arrow as well as my father’s side with the blue arrows.

If Cheryl was a more distant match, I would need additional triangulated matches to family members to confirm her match as legitimate and not a false positive or identical by chance.

  • Situation 3 – Of the three people, all three match me at the same addresses, but none of the three people match each other. How is this even possible?

Chromosome identical by chance.png

This situation seems very counter-intuitive since I have only 2 chromosomes, one from Mom and one from Dad – 2 sides of the street. It is confusing until you realize that one match (Cheryl and me, pink arrow) would be maternal, one would be paternal (Charlene and me, blue arrow) and the third (David and me, red arrows) would have DNA that bounces back and forth between my maternal and paternal sides, meaning the match with David is identical by chance (IBC.)

This means the third person, David, would match me, but not the people that are actually maternal and paternal matches. Let’s take a look at how this works.

Chromosome maternal paternal IBC.png

The addresses are the same, but the values that live at the addresses are not in this third scenario.

Maternal pink Match #1 is Cheryl, paternal blue Match #2 is Charlene.

In this example, Match #3, David, matches me because he has pink and blue at the same addresses that Mom and Dad have pink and blue, but he doesn’t have all pink (Mom) nor all blue (Dad), so he does NOT match either Cheryl or Charlene. This means that he is not a valid genealogical match – but is instead what is known as a false positive – identical by chance, not by descent. In essence, a wily genetic imposter waiting to fool unwary genealogists!

In his case, David is literally “two-faced” with parts of both values that live in the maternal house and the paternal house at those addresses. He is a “two-faced imposter” because he has elements of both but isn’t either maternal or paternal.

This is the perfect example of why matching and triangulating to known and confirmed family members is critical.

All three people, Cheryl, Charlene and David match me (double sided chromosomes), but none of them match each other (two legitimate faces – one from each parent’s side plus one imposter that doesn’t match either the legitimate maternal or paternal relatives on that segment.)
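Situation 3 can be played out in a small Python sketch. The names come from the example above, but the letter strings are invented for illustration (David's zig-zag pattern in particular), and each person is again simplified to one string of values on the segment.

```python
from itertools import combinations

my_reads = [{"A", "C"}] * 10   # my blended values at addresses 1-10
matches = {
    "Cheryl": "AAAAAAAAAA",    # all Mom's values
    "Charlene": "CCCCCCCCCC",  # all Dad's values
    "David": "ACACCAACCA",     # zig-zags between the two sides
}

def matches_me(values, reads):
    """True if every value is one of my two values at that address."""
    return all(v in pair for v, pair in zip(values, reads))

all_match_me = all(matches_me(v, my_reads) for v in matches.values())
pairs_that_match = [
    (a, b) for a, b in combinations(matches, 2) if matches[a] == matches[b]
]

# Compare David against relatives already confirmed on each side.
confirmed = {"maternal": matches["Cheryl"], "paternal": matches["Charlene"]}
david_sides = [side for side, v in confirmed.items() if v == matches["David"]]

print(all_match_me)      # True
print(pairs_that_match)  # []
print(david_sides)       # []
```

Everyone matches me, nobody matches anybody else, and David lines up with neither confirmed side. That empty list is the imposter giving himself away.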

Remember Three Things

  1. Double-Sided – Mom and Dad both have the same addresses on both sides of each chromosome street.
  2. Two Legitimate Faces – The DNA values, nucleotides, will have a unique pattern for both your Mom and Dad (unless they are endogamous or related) and therefore, there are two legitimate matching patterns on each chromosome – one for Mom and one for Dad. Two legitimate and different faces peering out of the houses on Mom’s side and Dad’s side of the street.
  3. Two-Faced Imposters – those identical by chance matches which zig-zag back and forth between Mom and Dad’s DNA at any given address (segment), don’t match confirmed maternal and paternal relatives on the same segment, and are confusing imposters.

Are you ready to hit your home run?

What’s Next?

Now that we understand how matching and triangulation work and why, let’s put this to work at the vendors. Join me for my article in a few days, Triangulation in Action at Family Tree DNA, MyHeritage, 23andMe and GedMatch.

We will step through how triangulation works at each vendor. You’ll have matches at each vendor that you don’t have elsewhere. If you haven’t transferred your DNA file yet, you still have time with the step by step instructions below:

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Native American & Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments

Ethnicity is always a ticklish subject. On one hand we say to be leery of ethnicity estimates, but on the other hand, we all want to know who our ancestors were and where they came from. Many people hope to prove or disprove specific theories or stories about distant ancestors.

Reasons to be cautious about ethnicity estimates include:

  • Within continents, like Europe, it’s very difficult to discern ethnicity at the “country” level because of thousands of years of migration across regions where borders exist today. Ethnicity estimates within Europe can be significantly different than known and proven genealogy.
  • “Countries” in Europe are political constructs about the same size as many states in the US, and differentiation between those populations is almost impossible to discern accurately. Think of trying to figure out the difference between the populations of Indiana and Illinois, for example. Yet we want to be able to tell the difference between ancestors who came from France and Germany.

Ethnicity states over Europe

  • All small amounts of ethnicity, even at the continental level, under 2-5%, can be noise and might be incorrect. That’s particularly true of trace amounts, 1% or less. However, that’s not always the case – which is why companies provide those small percentages. When hunting ancestors in the distant past, that small amount of ethnicity may be the only clue we have as to where they reside at detectable levels in our genome.

Noise in this case is defined as:

  • A statistical anomaly
  • A chance combination of your DNA from both parents that matches a reference population
  • Issues with the reference population itself, specifically admixture
  • Perhaps combinations of the above

You can read about the challenges with ethnicity here and here.

On the Other Hand

Having restated the appropriate caveats, on the other hand, we can utilize legitimate segments of our DNA to identify where our ancestors came from – at the continental level.

I’m specifically referring to Native American admixture, which is the example I’ll be using, but this process applies equally well to other minority or continental-level admixture. Minority, in this sense, means minority ethnicity to you.

Native American ethnicity shows distinctly differently from African and European. Sometimes some segments of DNA that we inherit from Native American ancestors are reported as Asian, specifically Siberian, Northern or Eastern Asian.

Remember that the Native American people arrived as a small group via Beringia, a now flooded land bridge that once connected Siberia with Alaska.

beringia map

By Erika Tamm et al – Tamm E, Kivisild T, Reidla M, Metspalu M, Smith DG, et al. (2007) Beringian Standstill and Spread of Native American Founders. PLoS ONE 2(9): e829. doi:10.1371/journal.pone.0000829. Also available from PubMed Central., CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=16975303

After that time, the Native American/First Nations peoples were isolated from Asia, for the most part, and entirely from Europe until European exploration resulted in the beginning of sustained European settlement, and admixture beginning in the late 1400s and 1500s in the Americas.

Family Inheritance

Testing multiple family members is extremely useful when working with your own personal minority heritage. This approach assumes that you’d like to identify your matches that share that genetic heritage because they share the same minority DNA that you do. Of course, that means you two share the same ancestor at some time in the past. Their genealogy, or your combined information, may hold the clue to identifying your ancestor.

In my family, my daughter has Native American segments that she inherited from me that I inherited from my mother.

Finding the same segment identified as Native American in several successive generations eliminates the possibility that the chance combination of DNA from your father and mother is “appearing” as Native, when it isn’t.

We can use segment information to our benefit, especially if we don’t know exactly who contributed that DNA – meaning which ancestor.

We need to find a way to utilize those Native or other minority segments genealogically.

23andMe

Today, the only DNA testing vendor that provides consumers with a segment identification of our ethnicity predictions is 23andMe.

If you have tested at 23andMe, sign in and click on Ancestry on the top tab, then select Ancestry Composition.

Minority ethnicity ancestry composition.png

Scroll down until you see your painted chromosomes.

Minority ethnicity chromosome painting.png

By clicking on the region at left that you want to see, the rest of the regions are greyed out and only that region is displayed on your chromosomes, at right.

Minority ethnicity Native.png

According to 23andMe, I have two Native segments, one each on chromosomes 1 and 2. They show these segments on opposite chromosomes, meaning one (the top for example) would be maternal or paternal, and the bottom one would be the opposite. But 23andMe apparently could not tell for sure because neither my mother nor my father has tested there. This placement also turned out to be incorrect. The above image was my initial V3 test at 23andMe. My later V4 results were different.

Versions May Differ

Please note that your ethnicity predictions may differ based on which test you took, which is dictated by when you took the test. The image above is my V3 test that was in use at 23andMe between 2010 and November 2013, and the image below is my V4 test in use between November 2013 and August 2017.

23andMe apparently does not correct original errors involving what is known as “strand swap” where the maternal and paternal segments are inverted during analysis. My V4 test results are shown below, where the strands are correctly portrayed.

Minority ethnicity Native V4.png

Note that both Native segments are now on the lower chromosome “side” of the pair and the position on the chromosome 1 segment has shifted visually.

Minority ethnicity sides.png

I have not tested at 23andMe on the current V5 GSA chip, in use since August 9, 2017, but perhaps I should. The results might be different yet, with the concept being that each version offers an improvement over earlier versions as science advances.

If your parents have tested, 23andMe makes adjustments to your ethnicity estimates accordingly.

Although my mother can’t test at 23andMe, I happen to already know that these Native segments descend from my mother based on genealogical and genetic analysis, combined. I’m going to walk you through the process.

I can utilize my genealogy to confirm or refute information shown by 23andMe. For example, if one of those segments comes from known ancestors who were living in Germany, it’s clearly not Native, and it’s noise of some type.

We’re going to utilize DNAPainter to determine which ancestors contributed your minority segments, but first you’ll need to download your ethnicity segments from 23andMe.

Downloading Ethnicity Segment Data

Downloading your ethnicity segments is NOT THE SAME as downloading your raw DNA results to transfer to another vendor. Those are two entirely different files and different procedures.

To download the locations of your ethnicity segments at 23andMe, scroll down below your painted ethnicity segments in your Ancestry Composition section to “View Scientific Details.”

MInority ethnicity scientific details.png

Click on View Scientific Details and scroll down to near the bottom and then click on “Download Raw Data.” I leave mine at the 50% confidence level.

Minority ethnicity download raw data.png

Save this spreadsheet to your computer in a known location.

In the spreadsheet, you’ll see columns that provide the name of the segment, the chromosome copy number (1 or 2) and the chromosome number with start and end locations.

Minority ethnicity download.png

You really don’t care about this information directly, but DNAPainter does and you’ll care a lot about what DNAPainter does for you.
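If you’d like to peek at the file before painting, here is a minimal Python sketch that pulls out only the rows for one ethnicity. The column names (Ancestry, Copy, Chromosome, Start Point, End Point) are an assumption based on the columns described above, and the sample rows and positions are made up for illustration, so adjust both to match your own download.

```python
import csv
import io

# Hypothetical sample rows. The header names are an assumption based on the
# columns described in the article (segment name, copy number, chromosome,
# start and end locations); the positions are invented for illustration.
SAMPLE = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1,150000000
Native American,2,chr1,165000000,180000000
Native American,2,chr2,80000000,95000000
European,2,chr2,1,79999999
"""


def segments_for(csv_text, label):
    """Return (copy, chromosome, start, end) for rows whose name contains label."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (int(r["Copy"]), r["Chromosome"], int(r["Start Point"]), int(r["End Point"]))
        for r in rows
        if label.lower() in r["Ancestry"].lower()
    ]


native = segments_for(SAMPLE, "Native American")
for seg in native:
    print(seg)
```

To use it on a real download, read the saved file into a string and pass that in place of SAMPLE. Matching on a substring means the combined labels, such as “East Asian & Native American,” are picked up too.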

DNAPainter

I wrote introductory articles about DNAPainter:

If you’re not familiar with DNAPainter, you might want to read these articles first and then come back to this point in this article.

Go ahead – I’ll wait!

Getting Started

If you don’t have a DNAPainter account, you’ll need to create one for free. Some features, such as having multiple profiles, are subscription based, but the functionality you’ll need for one profile is free.

I’ve named this example profile “Ethnicity Demo.” You’ll see your name where mine says “Ethnicity Demo.”

Minority ethnicity DNAPainter.png

Click on “Import 23andme ancestry composition.”

You will copy and paste all the spreadsheet rows in the entire downloaded 23andMe ethnicity spreadsheet into the DNAPainter text box and make your selection, below. The great news is that if you discover that your assumption about copy 1 being maternal or paternal is incorrect, it’s easy to delete the ethnicity segments entirely and simply repaint later. Ditto if 23andMe changes your estimate over time, like they have mine.

Minority ethnicity DNAPainter sides.png

I happen to know that “copy 2” is maternal, so I’ve made that selection.

You can then see your ethnicity chromosome segments painted, and you can expand each one to see the detail. Click on “Save Segments.”

MInority ethnicity DNAPainter Native painting


In this example, you can see my Native segments, called by various names at different confidence levels at 23andMe, on chromosome 1.

Depending on the confidence level, these segments are called some mixture of:

  • East Asian & Native American
  • North Asian & Native American
  • Native American
  • Broadly East Asian & Native American

It’s exactly the same segment, so you don’t really care what it’s called. DNAPainter paints all of the different descriptions provided by 23andMe, at all confidence levels as you can see above.

The DNAPainter colors are different from 23andMe colors and are system-selected. You can’t assign the colors for ethnicity segments.

Now, I’m moving to my own profile that I paint with my ancestral segments. To date, I have 78% of my segments painted by identifying cousins with known common ancestors.

On chromosomes 1 and 2, copy 2, which I’ve determined to be my mother’s “side,” these segments track back to specific ancestors.

Minority ethnicity maternal side


Chromosome 1 segments, above, track back to the Lore family, descended from Antoine (Anthony) Lore (Lord) who married Rachel Hill. Antoine Lore was Acadian.

Minority ethnicity chromosome 1.png

Clicking on the green segment bar shows me the ancestors I assigned when I painted the match with my Lore family member whose name is blurred, but whose birth surname was Lore.

The Chromosome 2 segment, below, tracks back to the same family through a match to Fred.

Minority ethnicity chromosome 2.png

My common ancestors with Fred are Honore Lore and Marie Lafaille who are the parents of Antoine Lore.

Minority ethnicity common ancestor.png

There are additional matches on both chromosomes who also match on portions of the Native segments.

Now that I have a pointer in the ancestral direction that these Native American segments arrived from, what can traditional genealogy and other DNA information tell me?

Traditional Genealogy Research

The Acadian people were a mixture of English, French and Native American. The Acadians settled on the island of Nova Scotia in 1609 and lived there until being driven out by the English in 1755, roughly 6 or 7 generations later.

Minority ethnicity Acadian map.png

The Acadians intermarried with the Mi’kmaq people.

It had been reported by two very qualified genealogists that Philippe Mius, born in 1660, married two Native American women from the Mi’kmaq tribe given the name Marie.

The French were fond of giving the first name of Marie to Native women when they were baptized in the Catholic faith which was required before the French men were allowed to marry the Native women. There were many Native women named Marie who married European men.

Minority ethnicity Native mitochondrial tree


This Mius lineage is ancestral to Antoine Lore (Lord) as shown on my pedigree, above.

Mitochondrial DNA has revealed that descendants from one of Philippe Mius’s wives, Marie, carry haplogroup A2f1a.

However, mitochondrial tests show that other descendants of “Marie,” his first wife, carry haplogroup X2a2, which is also Native American.

Confusion has historically existed over which Marie is the mother of my ancestor, Francoise.

Karen Theriot Reader, another professional genealogist, shows Francoise Mius as the last child born to the first Native wife before her death sometime after 1684 and before about 1687 when Philippe remarried.

However, relative to the source of Native American segments, whether Francoise descends from the first or second wife doesn’t matter in this instance because both are Native and are proven so by their mitochondrial DNA haplogroups.

Additionally, on Antoine’s mother’s side, we find a Doucet male, although there are two genetic male Doucet lines, one of European origin, haplogroup R-L21, and one, surprisingly, of Native origin, haplogroup C-P39. Both are proven by their respective haplogroups but confusion exists genealogically over who descends from which lineage.

On Antoine’s mother’s side, there are several unidentified lineages, any one or multiples of which could also be Native. As you can see, there are large gaps in my tree.

We do know that these Native segments arrived through Antoine Lore and his parents, Honore Lore and Marie LaFaille. We don’t know exactly who upstream contributed these segments – at least not yet. Painting additional matches attributable to specific ancestral couples will eventually narrow the candidates and allow me to walk these segments back in time to their rightful contributor.

Segments, Traditional Research and DNAPainter

These three tools together, using continent-level segments in combination with painting the DNA segments of known cousins who match specific lineages, create a triangulated ethnicity segment.

When that segment just happens to be genealogically important, this combination can point the researchers in the right direction knowing which lines to search for that minority ancestor.

If your cousins who match you on this segment have also tested with 23andMe, they should also be identified as Native on this same segment. This process does not apply to intracontinental segments, meaning within Europe, because the admixture is too great and the ethnicity predictions are much less reliable.

When identifying minority admixture at the continental level, adding Y and mitochondrial DNA testing to the mix in order to positively identify each individual ancestor’s Y and mitochondrial DNA is very important in both eliminating and confirming what autosomal DNA and genealogy records alone can’t do. The base haplogroup as assigned at 23andMe is a good start, but it’s not enough alone. Plus, we only carry one line of mitochondrial DNA and only males carry Y DNA, and only their direct paternal line.

We need Y and mitochondrial DNA matching at FamilyTreeDNA to verify the specific lineage. Additionally, we very well may need the Y and mitochondrial DNA information that we don’t directly carry – but other cousins do. You can read about Y and mitochondrial DNA testing, here.

I wrote about creating a personal DNA pedigree chart including your ancestors’ Y and mitochondrial DNA here. In order to find people descended from a specific ancestor who have DNA tested, I utilize:

  • WikiTree resources and trees
  • Geni trees
  • FamilySearch trees
  • FamilyTreeDNA autosomal matches with trees
  • AncestryDNA autosomal matches and their associated trees
  • Ancestry trees in general, meaning without knowing if they are related to a DNA match
  • MyHeritage autosomal matches and their trees
  • MyHeritage trees in general

At both MyHeritage and Ancestry, you can view the trees of your matches, but you can also search for ancestors in other people’s trees to see who might descend appropriately to provide a Y or mitochondrial DNA sample. You will probably need a subscription to maximize these efforts. MyHeritage offers a free trial subscription here.

If you find people appropriately descended through WikiTree, Geni or FamilySearch, you’ll need to discuss DNA testing with them. They may have already tested someplace.

If you find people who have DNA tested through your DNA matches with trees at Ancestry and MyHeritage, you’ll need to offer a Y or mitochondrial DNA test to them if they haven’t already tested at FamilyTreeDNA.

FamilyTreeDNA is the only vendor who provides the Y DNA and mitochondrial DNA tests at the higher resolution level, beyond base haplogroups, required for matching and for a complete haplogroup designation.

If the person has taken the Family Finder autosomal test at FamilyTreeDNA, they may have already tested their Y DNA and mtDNA, or you can offer to upgrade their test.

Projects

Checking projects at FamilyTreeDNA can be particularly useful when trying to discover if anyone from a specific lineage has already tested. There are many special interest projects such as the Acadian AmerIndian Ancestry project, the American Indian project, haplogroup projects, surname projects and more.

You can view projects alphabetically here or you can click here to scroll down to enter the surname or topic you are seeking.

Minority ethnicity project search.png

If the topic isn’t listed, check the alphabetic index under Geographical Projects.

23andMe Maternal and Paternal Sides

If possible, you’ll want to determine which “side” of your family your minority segments come from, unless they come from both. You’ll want to determine whether chromosome side 1 or 2 is maternal, because the other one will be paternal.

23andMe doesn’t offer tree functionality in the same way as other vendors, so you won’t be able to identify people there descended from your ancestors without contacting each person or doing other sleuthing.

Recently, 23andMe added a link to FamilySearch that creates a list of your ancestors from their mega-shared tree for 7 generations, but there is no tree matching or search functionality. You can read about the FamilySearch connection functionality here.

So, how do you figure out which “side” is which?

Minority ethnicity minority segment.png

The chart above represents the portion of your chromosomes that contains your minority ancestry. Initially, you don’t know if the minority segment is your mother’s pink chromosome or your father’s blue chromosome. You have one chromosome from each parent with the exact same addresses or locations, so it’s impossible to tell which side is which without additional information. Either the pink or the blue segment is minority, but how can you tell?

In my case, the family oral history regarding Native American ancestry was from my father’s line, but the actual Native segments wound up being from my mother, not my father. Had I made an assumption, it would have been incorrect.

Fortunately, in our example, you have both a maternal and paternal aunt who have tested at 23andMe. You match both aunts on that exact same segment location – one from your father’s side, blue, and one from your mother’s side, pink.

You compare your match with your maternal aunt and verify that indeed, you do match her on that segment.

You’ll want to determine if 23andMe has flagged that segment as Native American for your maternal aunt too.

You can view your aunt’s Ancestry Composition by selecting your aunt from the “Your Connections” dropdown list above your own ethnicity chromosome painting.

Minority ethnicity relative connections.png

You can see on your aunt’s chromosomes that indeed, those locations on her chromosomes are Native as well.

Minority ethnicity relative minority segments.png

Now you’ve identified your minority segment as originating on your maternal side.

Minority ethnicity Native side.png

Let’s say you have another match, Match 1, on that same segment. You can easily tell which “side” Match 1 is from. Since you know that you match your maternal aunt on that minority segment, if Match 1 matches both you and your maternal aunt, then you know that’s the side the match is from – AND that person also shares that minority segment.

You can also view that person’s Ancestry Composition, but shared matching is more reliable, especially when dealing with small amounts of minority admixture.

Another person, Match 2, matches you on that same segment, but this time, the person matches you and your paternal aunt, so they don’t share your minority segment.

Minority ethnicity match side.png

Even if your paternal aunt had not tested, because Match 2 does not match you AND your maternal aunt, you know Match 2 doesn’t share your minority segment which you can confirm by checking their Ancestry Composition.
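The Match 1 and Match 2 reasoning above can be sketched as a small decision rule in Python. This is only an illustration of the logic, not a vendor feature. The function name and the two input sets are hypothetical, and each set is assumed to hold the people who match both you and that aunt on the same segment location.

```python
def classify_side(match_name, shared_with_maternal_aunt, shared_with_paternal_aunt=None):
    """Classify a match on the minority segment location.

    Each set is assumed to contain people who match BOTH you and that aunt
    on the same segment. The paternal set is optional because a paternal
    relative may not have tested.
    """
    if match_name in shared_with_maternal_aunt:
        # Shares the maternal side, where the minority segment was found.
        return "maternal"
    if shared_with_paternal_aunt is not None and match_name in shared_with_paternal_aunt:
        return "paternal"
    # Without a paternal relative we can only say the match is not maternal.
    return "not maternal"


maternal_shared = {"Match 1"}
paternal_shared = {"Match 2"}

print(classify_side("Match 1", maternal_shared, paternal_shared))
print(classify_side("Match 2", maternal_shared, paternal_shared))
print(classify_side("Match 2", maternal_shared))
```

In this example the minority segment is on the maternal side, so only a “maternal” result suggests the match shares it. Checking that person’s Ancestry Composition remains a useful second confirmation.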

Download All of Your Matches

Rather than go through your matches one by one, it’s easiest to download your entire match list so you can see which people match you on those chromosome locations.

Minority ethnicity download aggregate data.png

You can click on “Download Aggregate Data” at 23andMe, at the bottom of your DNA Relatives match list, to obtain all of your matches who are sharing with you. 23andMe limits your matches to 2000 or fewer, the actual number being your highest 2000 matches minus the people who aren’t sharing. I have 1465 matches showing and that number decreases regularly as new testers at 23andMe are focused on health and not genealogy, meaning lower matches get pushed off the list of 2000 match candidates.

You can quickly sort the spreadsheet to see who matches you on specific segments. Then, you can check each match in the system to see if that person matches you and another known relative on the minority segments or you can check their Ancestry Composition, or both.
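If you prefer a script to sorting by hand, the same filtering can be sketched in Python. The field names and positions below are hypothetical stand-ins for rows in the downloaded match spreadsheet, so rename them to match the columns in your own file.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True when two ranges on the same chromosome share at least one position."""
    return start_a <= end_b and start_b <= end_a


def matches_on_segment(matches, chromosome, start, end):
    """Return the matches whose shared segment overlaps the given location."""
    hits = [
        m for m in matches
        if m["chromosome"] == chromosome
        and overlaps(m["start"], m["end"], start, end)
    ]
    return sorted(hits, key=lambda m: m["start"])


# Hypothetical rows standing in for the downloaded match list.
matches = [
    {"name": "Match 1", "chromosome": "1", "start": 160_000_000, "end": 175_000_000},
    {"name": "Match 2", "chromosome": "1", "start": 20_000_000, "end": 40_000_000},
    {"name": "Match 3", "chromosome": "2", "start": 85_000_000, "end": 90_000_000},
]

# Illustrative location for a minority segment on chromosome 1.
candidates = matches_on_segment(matches, "1", 165_000_000, 180_000_000)
print([m["name"] for m in candidates])
```

An overlap only says that a match falls on the same addresses. It doesn’t say which side of the chromosome the match is on, so each candidate still needs the shared-relative check described above.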

If they share your minority segment, then you can check their tree link if they have one, included in the download, their Family Search information if included on their account, or reach out to them to see if you might share a known ancestor.

The key to making your ethnicity segment work for you is to identify ancestors and paint known matches.

Paint Those Matches

When searching for matches whose DNA you can attribute to specific ancestors, be sure to check at all 4 places that provide segment information that you can paint:

  • 23andMe
  • FamilyTreeDNA
  • MyHeritage
  • GedMatch

At GedMatch, you’ll find some people who have tested at the other various vendors, including Ancestry, but unfortunately not everyone uploads. Ancestry doesn’t provide segment information, so you won’t be able to paint those matches directly from Ancestry.

If your Ancestry matches transfer to GedMatch, FamilyTreeDNA or MyHeritage you can view your match and paint your common segments. At GedMatch, Ancestry kit numbers begin with an A. I use my Ancestry kit matches at GedMatch to attempt to figure out who that match is at Ancestry in order to attempt to figure out the common ancestor.

To Paint, You Must Test

Of course, in order to paint your matches that you find in various databases, you need to be in those databases, meaning you either need to test there or transfer your DNA file.

Transfers

If you’d like to test your DNA at one vendor and download the file to transfer to another vendor, or GedMatch, that’s possible with both FamilyTreeDNA and MyHeritage who both accept uploads.

You can transfer kits from Ancestry and 23andMe to both FamilyTreeDNA and MyHeritage for free, although the chromosome browsers, advanced tools and ethnicity require an unlock fee (or alternatively a subscription at MyHeritage). Still, the free transfer and unlock for $19 at FamilyTreeDNA or $29 at MyHeritage is less than the cost of testing.

Here’s a quick cheat sheet.

DNA vendor transfer cheat sheet 2019

From time to time, as vendor file formats change, the ability to transfer is temporarily interrupted, but it costs nothing to try a transfer to either MyHeritage or FamilyTreeDNA, or better yet, both.

In each of these articles, I wrote about how to download your data from a specific vendor and how to upload from other vendors if they accept uploads.

Summary Steps

In order to use your minority ethnicity segments in your genealogy, you need to:

  1. Test at 23andMe
  2. Identify which parental side your minority ethnicity segments are from, if possible
  3. Download your ethnicity segments
  4. Establish a DNAPainter account
  5. Upload your ethnicity segments to DNAPainter
  6. Paint matches of people with whom you share known common ancestors utilizing segment information from 23andMe, FamilyTreeDNA, MyHeritage and AncestryDNA matches who have uploaded to GedMatch
  7. If you have not tested at either MyHeritage or FamilyTreeDNA, upload your 23andMe file to either vendor for matching, along with GedMatch
  8. Focus on those minority segments to determine which ancestral line they descend through in order to identify the ancestor(s) who provided your minority admixture.

Have fun!

______________________________________________________________


First Steps When Your DNA Results are Ready – Sticking Your Toe in the Genealogy Water

First steps helix

Recently someone asked me what the first steps would be for a person who wasn’t terribly familiar with genealogy and had just received their DNA test results.

I wrote an article called DNA Results – First Glances at Ethnicity and Matching which was meant to show new folks what the various vendor interfaces look like. I was hoping this might whet their appetites for more, meaning that the tester might, just might, stick their toe into the genealogy waters😊

I’m hoping this article will help them get hooked! Maybe that’s you!

A Guide

This article can be read in one of two ways – as an overview, or, if you click the links, as a pretty thorough lesson. If you’re new, I strongly suggest reading it as an overview first, then a second time as a deeper dive. Use it as a guide to navigate your results as you get your feet wet.

I’ll be hotlinking to various articles I’ve written on lots of topics, so please take a look at details (eventually) by clicking on those links!

This article is meant as a guideline for what to do, and how to get started with your DNA matching results!

If you’re looking for ethnicity information, check out the First Glances article, plus here and here and here.

Concepts – Calculating Ethnicity Percentages provides you with guidelines for how to estimate your own ethnicity percentages based on your known genealogy and Ethnicity Testing – A Conundrum explains how ethnicity testing is done.

OK, let’s get started. Fun awaits!

The Goal

The goal for using DNA matching in genealogy depends on your interests.

  1. To discover cousins and family members that you don’t know. Some people are interested in finding and meeting relatives who might have known their grandparents or great-grandparents in the hope of discovering new family information or photos they didn’t know existed previously. I’ve been gifted with my great-grandparent’s pictures, so this strategy definitely works!
  2. To confirm ancestors. This approach presumes that you’ve done at least a little genealogy, enough to construct at least a rudimentary tree. Ancestors are “confirmed” when you DNA match multiple other people who descend from the same ancestor through multiple children. I wrote an article, Ancestors: What Constitutes Proof?, discussing how much evidence is enough to actually confirm an ancestor. Confirmation is based on a combination of both genealogical records and DNA matching and it varies depending on the circumstances.
  3. Adoptees and people with unknown parents seeking to discover the identities of those people aren’t initially looking at their own family tree – because they don’t have one yet. The genealogy of others can help them figure out the identity of those mystery people. I wrote about that technique in the article, Identifying Unknown Parents and Individuals Using DNA Matching.

DNAAdoption for Everyone

Educational resources for adoptees and non-adoptees alike can be found at www.dnaadoption.org. DNAAdoption is not just for adoptees and provides first rate education for everyone. They also provide trained and mentored search angels for adoptees who understand the search process along with the intricacies of navigating the emotional minefield of adoption and unknown parent searches.

“First Look” classes for each vendor are free for everyone at DNAAdoption and are self-paced, downloadable onto your computer as a pdf file. Intro to DNA, Applied Autosomal DNA and Y DNA Basics classes are nominally priced at between $29 and $49 and I strongly recommend these. DNAAdoption is entirely non-profit, so your class fee or contribution supports their work. Additional resources can be found here and their 12 adoptee search steps here.

Ok, now let’s look at your results.

Matches are the Key

Regardless of your goal, your DNA matches are the key to finding answers, whether you want to make contact with close relatives, prove your more distant ancestors or you’re involved in an adoptee or unknown parent search.

Your DNA matches that of other people because each of you inherited a piece of DNA, called a segment, where many locations are identical. The length of that DNA segment is measured in centiMorgans and those locations are called SNPs, or single nucleotide polymorphisms. You can read about the definition of a centimorgan and how they are used in the article Concepts – CentiMorgans, SNPs and Pickin’Crab.

While the scientific details are great, they aren’t important initially. What is important is to understand that the more closely you match someone, the more closely you are related to them. You share more DNA with close relatives than more distant relatives.

For example, I share exactly half of my mother’s DNA, but only about 25% of each of my grandparents’ DNA. As the relationships move further back in time, I share less and less DNA with other people who descend from those same ancestors.
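The halving pattern in that example can be written as a tiny Python sketch. It assumes the simple rule that the expected share halves with each generation back. Only the parent figure is exact. Beyond that these are averages, and the amount you actually inherited from any one ancestor varies.

```python
def expected_share(generations_back):
    """Average fraction of your DNA expected from one ancestor.

    generations_back: 1 for a parent, 2 for a grandparent, and so on.
    Assumes the expected share halves each generation. Real inheritance
    beyond a parent varies around this average.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or more")
    return 0.5 ** generations_back


for label, gens in [("parent", 1), ("grandparent", 2), ("great-grandparent", 3)]:
    print(f"{label}: {expected_share(gens):.1%}")
```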

Informational Tools

Every vendor’s match page looks different, as was illustrated in the First Glances article, but regardless, you are looking for four basic pieces of information:

  • Who you match
  • How much DNA you share with your match
  • Who else you and your match share that DNA with, which suggests that you all share a common ancestor
  • Family trees to reveal the common ancestor between people who match each other

Every vendor has different ways of displaying this information, and not all vendors provide everything. For example, 23andMe does not support trees, although they allow you to link to one elsewhere. Ancestry does not provide a tool called a chromosome browser which allows you to see if you and others match on the same segment of DNA. Ancestry only tells you THAT you match, not HOW you match.

Each vendor has their strengths and shortcomings. As genealogists, we simply need to understand how to utilize the information available.

I’ll be using examples from all 4 major vendors:

Your matches are the most important information and everything else is based on those matches.

Family Tree DNA

I have tested many family members from both sides of my family at Family Tree DNA using the Family Finder autosomal test which makes my matches there incredibly useful because I can see which family members, in addition to me, my matches match.

Family Tree DNA assigns matches to maternal and paternal sides in a unique way, even if your parents haven’t tested, so long as some close relatives have tested. Let’s take a look.

First Steps Family Tree DNA matches.png

Sign on to your account and click to see your matches.

At the top of your Family Finder matches page, you’ll see three groups of things, shown below.

First Steps Family Tree DNA bucketing

A row of tools at the top titled Chromosome Browser, In Common With and Not in Common With.

A second row of tabs that include All, Paternal, Maternal and Both. These are the maternal and paternal tabs I mentioned, meaning that I have a total of 4645 matches, 988 of which are from my paternal side and 847 of which are from my maternal side.

Family Tree DNA assigns people to these “buckets” based on matches with third cousins or closer if you have them attached in your tree. This is why it’s critical to have a tree and test close relatives, especially people from earlier generations like aunts, uncles, great-aunts/uncles and their children if they are no longer living.

If you have one or both parents that can test, that’s a wonderful boon because anyone who matches you and one of your parents is automatically bucketed, or phased (scientific term) to that parent’s side of the tree. However, at Family Tree DNA, it’s not required to have a parent test to have some matches assigned to maternal or paternal sides. You just need to test third cousins or closer and attach them to the proper place in your tree.

How does bucketing work?

Maternal or Paternal “Side” Assignment, aka Bucketing

If I match a maternal first cousin, Cheryl, for example, and we both match John Doe on the same segment, John Doe is automatically assigned to my maternal bucket with a little maternal icon placed beside the match.
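For readers who like to see the logic spelled out, here is a minimal sketch of the bucketing idea in Python. It is not Family Tree DNA's actual algorithm, which isn't described here. The names `Segment`, `overlap` and `bucket` are mine, the positions are made up, and a fuller version would also confirm that Cheryl and John Doe match each other on that same stretch.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # position where the match to me begins
    end: int    # position where the match to me ends


def overlap(a: Segment, b: Segment) -> int:
    """Length of the stretch two segments have in common (0 if none)."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def bucket(match, maternal_relative, paternal_relative):
    """Assign a match to a side based on overlap with known relatives."""
    on_maternal = any(overlap(m, r) > 0 for m in match for r in maternal_relative)
    on_paternal = any(overlap(m, r) > 0 for m in match for r in paternal_relative)
    if on_maternal and on_paternal:
        return "Both"
    if on_maternal:
        return "Maternal"
    if on_paternal:
        return "Paternal"
    return "Unassigned"


# Made-up positions: where Cheryl (maternal), a hypothetical paternal cousin
# and John Doe each match me.
cheryl = [Segment(3, 10_000_000, 60_000_000)]
paternal_cousin = [Segment(7, 5_000_000, 30_000_000)]
john_doe = [Segment(3, 40_000_000, 55_000_000)]

print(bucket(john_doe, cheryl, paternal_cousin))  # prints Maternal
```

The point of the sketch is simply that a known relative "labels" a stretch of my DNA, and anyone else who matches me on that labeled stretch inherits the label.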

First Steps Family Tree DNA match info

Every vendor provides an estimated or predicted relationship based on a combination of total centiMorgans and the longest contiguous matching segment. The actual “linked relationship” is calculated based on where this person resides in your tree.

The common surnames at far right are a very nice feature, but not every tester provides that information. When the testers do include surnames at Family Tree DNA, common surnames are bolded. Other vendors have similar features.

People with trees are shown near their profile picture with a blue pedigree icon. Clicking on the pedigree icon will show you their ancestors. Your match’s estimated relationship to you indicates how far back you should expect to share an ancestor.

For example, first cousins share grandparents. Second cousins share great-grandparents. In general, the further back in time your common ancestor, the less DNA you can be expected to share.
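The halving pattern behind those expectations can be written as simple arithmetic. The sketch below is illustrative only: the function name is mine, it assumes full (not half) cousins, and it gives averages. Actual matches vary around these numbers because inheritance is random.

```python
def expected_shared_fraction(cousin_degree: int, times_removed: int = 0) -> float:
    """Average fraction of autosomal DNA shared by full cousins.

    Each cousin sits (cousin_degree + 1) generations below the shared
    ancestral couple, every generation halves what is passed down, and
    the couple contributes two ancestors.
    """
    generations_total = 2 * (cousin_degree + 1) + times_removed
    return 2 * 0.5 ** generations_total


for label, degree, removed in [
    ("first cousins", 1, 0),
    ("first cousins once removed", 1, 1),
    ("second cousins", 2, 0),
    ("third cousins", 3, 0),
]:
    print(f"{label}: {expected_shared_fraction(degree, removed):.2%}")
```

First cousins come out at 12.5% on average, and each additional generation on either side cuts the expected amount in half again.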

You can view relationship information in chart form in my article here or utilize DNAPainter tools, here, to see the various possibilities for the different match levels.

Clicking on the pedigree chart of your match will show you their tree. In my tree, I’ve connected my parents in their proper places, along with Cheryl and Don, mother’s first cousins. (Yes, they’ve given permission for me to utilize their results, so they aren’t always blurred in images.)

Cheryl and Don are my first cousins once removed, meaning my mother is their first cousin and I’m one generation further down the tree. I’m showing the amount of DNA that I share with each of them in red in the format of total DNA shared and longest unbroken segment, taken from the match list. So 382-53 means I share a total of 382 cM and 53 cM is the longest matching block.

First Steps Family Tree DNA tree.png

The Chromosome Browser

Utilizing the chromosome browser, I can see exactly where I match both Don and Cheryl. It’s obvious that I match them on at least some different pieces of my DNA, because the total and longest segment amounts are different.

The reason it’s important to test lots of close relatives is because even siblings inherit different pieces of DNA from their parents, and they don’t pass the same DNA to their offspring either – so in each generation the amount of shared DNA is probably reduced. I say probably because sometimes segments are passed entirely and sometimes not at all, which is how we “lose” our ancestors’ DNA over the generations.

Here’s a matching example utilizing a chromosome browser.

First Steps Family Tree DNA chromosome browser.png

I clicked the checkboxes to the left of both Cheryl and Don on the match page, then the Chromosome Browser button, and now you can see, above, on chromosomes 1-16 where I match Cheryl (blue) and Don (red.)

In this view, both Don and Cheryl are being compared to me, since I’m the one signed in to my account and viewing my DNA matches. Therefore, one of the bars at each chromosome represents Don’s DNA match to me and one represents Cheryl’s. Cheryl is the first person and Don is the second. Person match colors (red and blue) are assigned arbitrarily by the system.

My grandfather and Cheryl/Don’s father, Roscoe, were siblings.

You can see that on some segments, my grandfather and Roscoe inherited the same segment of DNA from their parents, because today, my mother gave me that exact same segment that I share with both Don and Cheryl. Those segments are exactly identical and shown in the black boxes.

The only way for us to share this DNA today is for us to have shared a common ancestor who gave it to two of their children who passed it on to their descendants who DNA tested today.

On other segments, in red boxes, I share part of the same segments of DNA with Cheryl and Don, but someone along the line didn’t inherit all of that segment. For example on chromosome 3, in the red box, you can see that I share more with Cheryl (blue) than Don (red.)

In other cases, I share with either Don or Cheryl, but Don and Cheryl didn’t inherit that same segment of DNA from their father, so I don’t share with both of them. Those are the areas where you see only blue or only red.

On chromosome 12, you can see where it looks like Don’s and Cheryl’s segments butt up against each other. The DNA was clearly divided there. Don received one piece and Cheryl got the other. That’s known as a crossover and you can read about crossovers here, if you’d like.
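If you download segment data, the comparisons above boil down to interval arithmetic. Here is a small sketch, assuming each segment is a made-up `(chromosome, start, end)` tuple describing where a person matches me. The function names and the numbers are hypothetical and are not taken from my actual results.

```python
def shared_regions(first, second):
    """Regions where both people match me on the same stretch of DNA."""
    regions = []
    for chrom_a, start_a, end_a in first:
        for chrom_b, start_b, end_b in second:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                regions.append((chrom_a, start, end))
    return sorted(regions)


def abutting_points(first, second):
    """Points where one person's segment ends exactly where the other's begins."""
    points = []
    for chrom_a, start_a, end_a in first:
        for chrom_b, start_b, end_b in second:
            if chrom_a != chrom_b:
                continue
            if end_a == start_b:
                points.append((chrom_a, end_a))
            elif end_b == start_a:
                points.append((chrom_a, end_b))
    return sorted(points)


# Made-up positions for illustration only.
cheryl = [(3, 10, 80), (12, 0, 40)]
don = [(3, 30, 60), (12, 40, 90)]

print(shared_regions(cheryl, don))   # prints [(3, 30, 60)]
print(abutting_points(cheryl, don))  # prints [(12, 40)]
```

In this toy data, the chromosome 3 result is the stretch all three of us share, while the chromosome 12 result is a candidate crossover point where one sibling's segment stops and the other's starts.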

It’s important to be able to view segment information to be able to see how others match in order to identify which common ancestor that DNA came from.

In Common With

You can use the “In Common With” tool to see who you match in common with any match. My first 6 matches in common with Cheryl are shown below. Note that they are already all bucketed to my maternal side.

First Steps Family Tree DNA in common with

You can select up to 7 individuals using the check boxes at left to show them on the chromosome browser at once to see if they match you on common segments.

Each matching segment has its own history and may descend from a different ancestor in your common tree.

First Steps 7 match chromosome browser

If combinations of people do match me on a common segment, because these matches are all on my maternal side, they are triangulated and we know they have to descend from a common ancestor, assuming the segment is large enough. You can read about the concept of triangulation here. Triangulation occurs when 3 or more people (who aren’t extremely closely related like parents or siblings) all match each other on the same reasonably sized segment of DNA.
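For spreadsheet-minded readers, the triangulation test can be sketched in a few lines of Python. This is my own illustration, not any vendor's implementation. It assumes you already know the pairwise matching segments among three people, and it expresses positions in cM along the chromosome as a simplification. The 7 cM minimum, the name "Jane Roe" and the sample data are assumptions.

```python
from itertools import product


def triangulated_regions(pairwise, people, min_length=7.0):
    """Regions where all three pairs of people match on the same stretch.

    `pairwise` maps a frozenset of two names to a list of
    (chromosome, start, end) segments on which that pair matches.
    """
    a, b, c = people
    lists = [pairwise.get(frozenset(pair), []) for pair in ((a, b), (a, c), (b, c))]
    regions = []
    for (c1, s1, e1), (c2, s2, e2), (c3, s3, e3) in product(*lists):
        if c1 == c2 == c3:
            start, end = max(s1, s2, s3), min(e1, e2, e3)
            if end - start >= min_length:
                regions.append((c1, start, end))
    return regions


# Made-up segments for illustration only.
pairwise = {
    frozenset({"me", "Cheryl"}): [(5, 20.0, 60.0)],
    frozenset({"me", "John Doe"}): [(5, 35.0, 70.0)],
    frozenset({"Cheryl", "John Doe"}): [(5, 30.0, 55.0)],
    frozenset({"me", "Jane Roe"}): [(9, 10.0, 30.0)],
    frozenset({"Cheryl", "Jane Roe"}): [(2, 5.0, 25.0)],
}

print(triangulated_regions(pairwise, ("me", "Cheryl", "John Doe")))  # prints [(5, 35.0, 55.0)]
print(triangulated_regions(pairwise, ("me", "Cheryl", "Jane Roe")))  # prints []
```

Jane Roe matches both Cheryl and me, but on different chromosomes, so she shows up as a match in common without triangulating.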

If you want to download your matches and work through this process in a spreadsheet, that’s an option too.

Size Matters

Small segments can be identical by chance instead of identical by descent.

  • “Identical by chance” means that you accidentally match someone because your DNA on that segment has been combined from both parents and causes it to match another person, making the segment “look like” it comes from a common ancestor, when it really doesn’t. When DNA is sequenced, both your mother’s and father’s strands are sequenced, meaning that there’s no way to determine which came from whom. Think of a street with Mom’s side and Dad’s side with identical addresses on the houses on both sides. I wrote about that here.
  • “Identical by descent” means that the DNA is identical because it actually descends from a common ancestor. I discussed that concept in the article, We Match, But Are We Related.
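Here is a toy illustration of how an identical by chance match can arise. It uses only six made-up positions, where real segments contain hundreds or thousands, and the two checking functions are my own simplification rather than any vendor's matching code.

```python
# One letter per "street address"; all values are invented.
mom_strand = "ACAGTA"
dad_strand = "GTCACG"
stranger = "ATAATG"


def matches_unphased(strand_a, strand_b, other):
    """True if, at every position, the other person's letter equals
    EITHER of my two letters (Mom's or Dad's)."""
    return all(o in (a, b) for a, b, o in zip(strand_a, strand_b, other))


def matches_one_parent(strand_a, strand_b, other):
    """True only if the other person's letters follow one of my
    parental strands all the way through."""
    return other == strand_a or other == strand_b


print(matches_unphased(mom_strand, dad_strand, stranger))    # prints True
print(matches_one_parent(mom_strand, dad_strand, stranger))  # prints False
```

The stranger "matches" only by zigzagging between Mom's letter at one address and Dad's letter at the next, which is exactly what a false positive looks like.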

Generally, we only utilize 7cM (centiMorgan) segments and above because at that level, about half of the segments are identical by descent and about half are identical by chance, known as false positives. By the time we move above 15 cM, most, but not all, matches are legitimate. You can read about segment size and accuracy here.
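If you work with downloaded match lists, those rules of thumb are easy to apply as a filter. The sketch below simply encodes the 7 cM and 15 cM guidelines mentioned above. The function name, the wording of the labels and the sample data are my own, and the cutoffs are guidelines rather than guarantees.

```python
def segment_confidence(length_cm: float) -> str:
    """Rough reliability label for a matching segment, by size in cM."""
    if length_cm < 7:
        return "below the usual 7 cM cutoff"
    if length_cm <= 15:
        return "use with caution, may be identical by chance"
    return "most likely identical by descent"


# Made-up segment sizes for illustration.
segments_cm = [3.2, 7.0, 11.5, 22.8, 53.0]
for size in segments_cm:
    print(f"{size:>5} cM: {segment_confidence(size)}")

worth_reviewing = [size for size in segments_cm if size >= 7]
```

Even the largest category says "most likely," because size alone never proves which ancestor a segment came from.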

Using “In Common With” and the Matrix

“In Common With” is about who shares DNA. You can select someone you match to see who else you BOTH match. Just because you match two other people doesn’t necessarily mean that it’s on the same segment of DNA. In fact, you could match one person from your mother’s side and the other person from your father’s side.

First Steps match matrix.png

In this example, you match Person B due to ancestor John Doe and Person C due to ancestor Susie Smith. However, Person B also matches Person C, but due to ancestor William West that they share and you don’t.

This example shows you THAT they match, but not HOW they match.

The only way to assure that the matches between the three people above are due to the same ancestor is to look at the segments with a chromosome browser and compare all 3 people to each other. Finding 3 people who match on the same segment, from the same side of your tree means that (assuming a reasonably large segment) you share a common ancestor.

Family Tree DNA has a nice matrix function that allows you to see which of your matches also match each other.

First steps matrix link

The important distinction between the matrix and the chromosome browser is that the chromosome browser shows you where your matches match you, but those matches could be from both sides of your tree, unless they are bucketed. The matrix shows you if your matches also match each other, which is a huge clue that they are probably from the same side of your tree.

First Steps Family Tree DNA matrix.png

A matrix match is a significant clue in terms of who descends from which ancestors. For example, I know, based on who Amy matches, and who she doesn’t match, that she descends from the Ferverda side and that Charles, Rex and Maxine descend from ancestors on the Miller side.
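The idea behind a matrix is easy to reproduce yourself. This sketch builds a simple yes/no grid from a list of who matches whom. The match pattern below is invented to illustrate the concept and is not my actual matrix, and `build_matrix` is a hypothetical helper, not a Family Tree DNA feature.

```python
def build_matrix(names, matching_pairs):
    """Return matrix[a][b] == True when a and b match each other."""
    pairs = {frozenset(pair) for pair in matching_pairs}
    return {
        a: {b: (a != b and frozenset((a, b)) in pairs) for b in names}
        for a in names
    }


names = ["Amy", "Charles", "Rex", "Maxine"]
# Invented pattern: the three Miller-side cousins match each other,
# while Amy matches none of them.
matching_pairs = [("Charles", "Rex"), ("Charles", "Maxine"), ("Rex", "Maxine")]

matrix = build_matrix(names, matching_pairs)
print(" " * 8 + "".join(f"{name:<8}" for name in names))
for row in names:
    cells = "".join(f"{'X' if matrix[row][col] else '.':<8}" for col in names)
    print(f"{row:<8}{cells}")
```

Clusters of X marks suggest people who belong on the same side of the tree, while a row with no marks, like Amy's in this invented example, hints at a different line.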

Looking in the chromosome browser, I can tell that Cheryl, Don, Amy and I match on some common segments.

Matching multiple people on the same segment that descends from a common ancestor is called triangulation.

Let’s take a look at the MyHeritage triangulation tool.

MyHeritage

Moving now to MyHeritage, which provides us with an easy-to-use triangulation tool, we see the following when clicking on DNA matches on the DNA tab on the toolbar.

First Steps MyHeritage matches

Cousin Cheryl is at MyHeritage too. By clicking on Review DNA Match, the purple button on the right, I can see who else I match in common with Cheryl, plus triangulation.

The list of people Cheryl and I both match is shown below, along with our relationships to each person.

First Steps MyHeritage triangulation

I’ve selected 2 matches to illustrate.

The first match has a little purple icon to the right which means that Amy triangulates with me and Cheryl.

The second match, Rex, has no purple icon. That tells me, without looking further, that while Cheryl and I both match Rex, Cheryl matches Rex on a different segment than I do.

Without additional genealogy work, using DNA alone, I can’t say whether or not Cheryl, Rex and I all share a common ancestor. As it turns out, we do. Rex is a known cousin who I tested. However, in an unknown situation, I would have to view the trees of those matches to make that determination.

Triangulation

Clicking on the purple triangulation icon for Amy shows me the segments that all 3 of us, me, Amy and Cheryl share in common as compared to me.

First Steps MyHeritage triangulation chromosome browser.png

Cheryl is red and Amy is yellow. The one segment bracketed with the rounded rectangle is the segment shared by all 3 of us.

Do we have a common ancestor? I know Cheryl and I do, but maybe I don’t know who Amy is. Let’s look at Amy’s tree which is also shown if I scroll down.

First Steps MyHeritage common ancestor.png

Amy didn’t have her tree built out far enough to show our common ancestor, but I immediately recognized the surname Ferverda found in her tree a couple of generations back. Darlene was the daughter of Donald Ferverda who was the son of Hiram Ferverda, my great-grandfather.

Hiram was the father of Cheryl’s father, Roscoe and my grandfather, John Ferverda.

First Steps Hiram Ferverda pedigree.png

Amy is my first cousin twice removed and that segment of DNA that I share with her is from either Hiram Ferverda or his wife Eva Miller.

Now, based on who else Amy matches, I can probably tell whether that segment descends from Hiram or Eva.

Viva triangulation!

Theory of Family Relativity

MyHeritage’s Theory of Family Relativity suggests how two people whose DNA matches might be related through a common ancestor, whenever MyHeritage can calculate a potential path between the 2 people.

MyHeritage uses a combination of tools to make that connection, including:

  • DNA matches
  • Your tree
  • Your match’s tree
  • Other people’s trees at MyHeritage, FamilySearch and Geni if the common ancestor cannot be found in your tree compared against your DNA match’s MyHeritage tree
  • Documents in the MyHeritage data collection, such as census records, for example.

MyHeritage theory update

To view the Theories, click on the purple “View Theories” banner or “View theory” under the DNA match.

First Steps MyHeritage theory of relativity

The theory is displayed in summary format first.

MyHeritage view full theory

You can click on the “View Full Theory” to see the detail and sources about how MyHeritage calculated various paths. I have up to 5 different theories that utilize separate resources.

MyHeritage review match

A wonderful aspect of this feature is that MyHeritage shows you exactly the information they utilized and calculates a confidence factor as well.

All theories should be viewed as exactly that and should be evaluated critically for accuracy, taking into consideration sources and documentation.

I wrote about using Theories of Relativity, with instructions, here and here.

I love this tool and find the Theories mostly accurate.

AncestryDNA

Ancestry doesn’t offer a chromosome browser or triangulation but does offer a tree view for people that you match, so long as you have a subscription. In the past, a special “Light” subscription for DNA only was available for approximately $49 per year that provided access to the trees of your DNA matches and other DNA-related features. You could not order online and had to call support, sometimes asking for a supervisor in order to purchase that reduced-cost subscription. The “Light” subscription did not provide access to anything outside of DNA results, meaning documents, etc. I don’t know if this is still available.

After signing on, click on DNA matches on the DNA tab on the toolbar.

You’ll see the following match list.

First Steps Ancestry matches

I’ve tested twice at Ancestry, the second time when they moved to their new chip, so I’m my own highest match. Click on any match name to view more.

First Steps Ancestry shared matches

You’ll see information about common ancestors if you have some in your trees, plus the amount of shared DNA along with a link to Shared Matches.

I found one of the same cousins at Ancestry whose match we were viewing at MyHeritage, so let’s see what her match to me at Ancestry looks like.

Below are my shared matches with that cousin. The notes to the right are mine, not provided by Ancestry. I make extensive use of the notes fields provided by the vendors.

First Steps Ancestry shared matches with cousin

On your match list, you can click on any match, then on Shared Matches to see who you both match in common. While Ancestry provides no chromosome browser, you can see the amount of DNA that you share and trees, if any exist.

Let’s look at a tree comparison when a common ancestor can be detected in a tree within the past 7 generations.

First Steps Ancestry view ThruLines.png

What’s missing of course is that I can’t see how we match because there’s no chromosome browser, nor can I see if my matches match each other.

Stitched Trees

What I can see, if I click on “View ThruLines” above or ThruLines on the DNA Summary page on the main DNA tab, is all of the people I match who Ancestry THINKS descend from a common ancestor with me. This ancestor information isn’t always taken from either person’s tree.

For example, if my match hadn’t included Hiram Ferverda in her tree, Ancestry would use other people’s trees to “stitch them together” such that the tester is shown to be descended from a common ancestor with me. Sometimes these stitched trees are accurate and sometimes they are not, although they have improved since they were first released. I wrote about ThruLines here.

First Steps Ancestry ThruLines tree

In closer generations, especially if you are looking to connect with cousins, tree matching is a very valuable tool. In the graphic above, you can see all of the cousins who descend from Hiram Ferverda who have tested and DNA match to me. These DNA matches to me either descend from Hiram according to their trees, or Ancestry believes they descend from Hiram based on other people’s trees.

With more distant ancestors, other people’s trees are increasingly likely to be copied with no sources, so take them with a very large grain of salt (perchance the entire salt lick.) I use ThruLines as hints, not gospel, especially the further back in time the common ancestor. I wish they reached back another couple of generations. They are great hints and they end with the 7th generation where my brick walls tend to begin!

23andMe

I haven’t mentioned 23andMe yet in this article. Genealogists do test there, especially adoptees who need to fish in every pond.

23andMe is often the 4th choice of the major 4 vendors for genealogy due to the following challenges:

  • No tree support, other than allowing you to link to a tree at FamilySearch or elsewhere. This means no tree matching.
  • Fewer than 2000 matches, meaning that every person is limited to a maximum of 2000 matches, minus however many of those 2000 don’t opt in for genealogical matching. Given that 23andMe’s focus is increasingly health, my number of matches continues to decrease and is currently just over 1500. The good news is that those 1500 are my highest, meaning closest matches. The bad news is that genealogy is not 23andMe’s focus.

If you are an adoptee, a die-hard genealogist or specifically interested in ethnicity, then test at 23andMe. Otherwise all three of the other vendors would be better choices.

However, like the other vendors, 23andMe does have some features that are unique.

Their ethnicity predictions are acknowledged to be excellent. Ethnicity at 23andMe is called Ancestry Composition, and you’ll see that immediately when you sign in to your account.

First Steps 23andMe DNA Relatives.png

Your matches at 23andMe are found under DNA Relatives.

First Steps 23andMe tools

At left, you’ll find filters and the search box.

The Mom’s side and Dad’s side filters sort your matches if you’ve tested your parents, but this isn’t like the Family Tree DNA bucketing, which assigns maternal and paternal sides by utilizing relatives through third cousins when your parents aren’t available for testing.

Family names aren’t your family names, but the top family names that match to you. Guess what my highest name is? Smith.

However, Ancestor Birthplaces are quite useful because you can sort by country. For example, my mother’s grandfather Ferverda was born in the Netherlands.

First Steps 23andMe country.png

If I click on Netherlands, I can see my 5 matches with ancestors born in the Netherlands. Of course, this doesn’t mean that I match because of my match’s Dutch ancestors, but it does provide me with a place to look for a common ancestor and I can proceed by seeing who I match in common with those matches. Unfortunately, without trees we’re left to rely on ancestor birthplaces and family surnames, if my matches have entered that information.

One of my Dutch matches also matches my Ferverda cousin. Given that connection, and that the Ferverda family immigrated from Holland in 1868, that’s a starting point.

MyHeritage has a similar feature and is much more prevalent in Europe.

By clicking on my Ferverda cousin, I can view the DNA we share, who we match in common, our common ethnicity and more. I have the option of comparing multiple people in the chromosome browser by clicking on “View DNA Comparison” and then selecting who I wish to compare.

First Steps 23andMe view DNA Comparison.png

By scrolling down instead of clicking on View DNA Comparison, I can view where my Ferverda cousin matches me on my chromosomes, shown below.

First Steps 23andMe chromosome browser.png

23andMe identifies completely identical segments, which would be painted in dark purple, as shown in the legend at bottom left.

Adoptees love this feature because it would immediately differentiate between half and full siblings. Full siblings share approximately 25% of the exact DNA on both their maternal and paternal strands of DNA, while half siblings only share the DNA from one parent – assuming their parents aren’t closely related. I share no completely identical DNA with my Ferverda cousin, so no segments are painted dark purple.
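To make the difference between “completely identical” and “half identical” concrete, here is a small sketch that compares two people one position at a time. The genotypes are invented and the function is my own simplification. Real tools look for long runs of consecutive positions, since any single position can agree by coincidence.

```python
def compare_position(genotype_a: str, genotype_b: str) -> str:
    """Compare two unordered two-letter genotypes at one position."""
    if sorted(genotype_a) == sorted(genotype_b):
        return "full"  # both letters agree, as on both strands
    if set(genotype_a) & set(genotype_b):
        return "half"  # at least one letter in common
    return "none"


# Made-up genotypes at four positions for two people.
me = ["AG", "CC", "TT", "AG"]
sibling = ["GA", "CT", "TT", "AA"]

print([compare_position(a, b) for a, b in zip(me, sibling)])
# prints ['full', 'half', 'full', 'half']
```

A long stretch of “full” results between two people is what gets painted dark purple, and is the signature that separates full siblings from half siblings.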

23andMe and Ancestry Maps Show Where Your Matches Live

Another reason that adoptees and people searching for birth parents or unknown relatives like 23andMe is because of the map function.

After clicking on DNA Relatives, click on the Map function at the top of the page which displays the following map.

First Steps 23andMe map

This isn’t a map of where your matches’ ancestors lived, but is where your matches THEMSELVES live. Furthermore, you can zoom in, click on the button and it displays the name of the individual and the city where they live or whatever they entered in the location field.

First Steps 23andMe your location on map.png

I entered a location in my profile and confirmed that the location indeed displays on my matches’ maps by signing on to another family member’s account. What I saw is the display above. I’d wager that most testers don’t realize that their home location and photo, if entered, are being displayed to their matches.

I think sharing my ancestors’ locations is a wonderful, helpful, idea, but there is absolutely no reason whatsoever for anyone to know where I live and I feel it’s stalker-creepy and a safety risk.

First Steps 23andMe questions.png

If you enter a location in this field in your profile, it displays on the map.

If you test with 23andMe and you don’t want your location to display on this map to your matches, don’t answer any question that asks you where you call home or anything similar. I never answer any questions at 23andMe. They are known for asking you the same question repeatedly, in multiple locations and ways, until you relent and answer.

Ancestry has a similar map feature and they’ve also begun to ask you questions that are unrelated to genealogy.

Ancestry Map Shows Where Your Matches Live

At Ancestry, when you click to see your DNA matches, look to the right at the map link.

First Steps Ancestry map link.png

By clicking on this link, you can see the locations that people have entered into their profile.

First Steps Ancestry match map.png

As you can see, above, I don’t have a location entered and I am prompted for one. Note that Ancestry does specifically say that this location will be shown to your matches.

You can click on the Ancestry Profile link here, or go to your Personal Profile by clicking the dropdown under your user name in the upper right hand corner of any page.

This is important because if you DON’T want your location to show, you need to be sure there is nothing entered in the location field.

First Steps Ancestry profile.png

Under your profile, click “Edit.”

First Steps Ancestry edit profile.png

After clicking edit, complete the information you wish to have public or remove the information you do not.

First Steps Ancestry location in profile.png

Sometimes Your Answer is a Little More Complicated

This is a First Steps article. Sometimes the answer you seek might be a little more complicated. That’s why there are specialists who deal with this all day, every day.

What issues might be more complex?

If you’re just starting out, don’t worry about these things for now. Just know when you run into something more complex or that doesn’t make sense, I’m here and so are others. Here’s a link to my Help page.

Getting Started

What do you need to get started?

  • You need to take a DNA test, or more specifically, multiple DNA tests. You can test at Ancestry or 23andMe and transfer your results to both Family Tree DNA and MyHeritage, or you can test directly at all vendors.

Neither Ancestry nor 23andMe accepts uploads, meaning other vendors’ tests, but both MyHeritage and Family Tree DNA accept most file versions. Instructions for how to download and upload your DNA results are found below, by vendor:

Both MyHeritage and Family Tree DNA charge a minimal fee to unlock their advanced features such as chromosome browsers and ethnicity if you upload transfer files, but it’s less costly in both cases than testing directly. However, if you want the MyHeritage DNA plus Health or the Family Tree DNA Y DNA or Mitochondrial DNA tests, you must test directly at those companies for those tests.

  • It’s not required, but it would be in your best interest to build as much of a tree at all three vendors as you can. Every little bit helps.

Your first tree-building step should be to record what your family knows about your grandparents and great-grandparents, aunts and uncles. Here’s what my first step attempt looked like. It’s cringe-worthy now, but everyone has to start someplace. Just do it!

You can build a tree at either Ancestry or MyHeritage and download your tree for uploading at the other vendors. Or, you can build the tree using genealogy software on your computer and upload to all 3 places. I maintain my primary tree on my computer using RootsMagic. There are many options. MyHeritage even provides free tree builder software.

Both Ancestry and MyHeritage offer research/data subscriptions that provide you with hints to historical documents that increase what you know about your ancestors. The MyHeritage subscription can be tried for free. I have full subscriptions to both Ancestry and MyHeritage because they both include documents in their collections that the other does not.

Please be aware that document suggestions are hints and each one needs to be evaluated in the context of what you know and what’s reasonable. For example, if your ancestor was born in 1750, they are not included in the 1900 census, nor do women have children at age 70. People do have exactly the same names. FindAGrave information is entered by humans and is not always accurate. Just sayin’…

Evaluate critically and skeptically.

Ok, Let’s Go!

When your DNA results are ready, sign on to each vendor, look at your matches and use this article to begin to feel your way around. It’s exciting and the promise is immense. Feel free to share the link to this article on social media or with anyone else who might need help.

You are the cumulative product of your ancestors. What better way to get to know them than through their DNA that’s shared between you and your cousins!

What can you discover today?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

MyHeritage LIVE Conference Day 2 – The Science Behind DNA Matching    

The MyHeritage LIVE Oslo conference is but a fond memory now, and I would count it as a resounding success.

Perhaps one of the reasons I enjoyed it so much is the scientific aspect and because the content is very focused on a topic I enjoy without being the size and complexity of RootsTech. The smaller, more intimate venue also provides access to the “right” people as well as the ability to meet other attendees and not be overwhelmed by the sheer size.

Here are some stats:

  • 401 registered guests
  • 28 countries represented including distant places like Australia and South America
  • More than 20 speakers plus the hands-on workshops where specialist teams worked with students
  • 38 sessions and workshops, plus the party
  • 60,000 livestream participants, in spite of the time differences around the world

I was blown away by the number of livestream attendees.

I don’t know what criteria Gilad Japhet will be using to determine “success” but I can’t imagine this conference being judged as anything but.

Let’s take a look at the second day. I spent part of the time talking to people and drifting in and out of the rear of several sessions for a few minutes. I meant to visit some of the workshops, but there was just too much good, distracting content elsewhere.

I began Sunday in Mike Mansfield’s presentation about SuperSearch. Yes, I really did attend a few sessions not about DNA, but my favorite was the session on Improved DNA Matching.

Improved DNA Matching

I’m sure it won’t surprise any of my readers that my favorite presentations were about the actual science of genetic genealogy.

Consumers don’t really need to understand the science behind autosomal results to reap the benefits, but the underlying science is part of what I love – and it’s important for me to understand the underpinnings to be able to unravel the fine points of what the resulting matches are and are not revealing. Misinterpretation of DNA results leading to faulty conclusions is a real issue in genetic genealogy today. Consequently, I feel that anyone working with other people’s results and providing advice really needs to understand how the science and technology work together.

Dr. Daphna Weissglas-Volkov, a population geneticist by training, although she clearly functions far beyond that scope today, gave a very interesting presentation about how MyHeritage handles (their greatly improved) DNA Matching. I’m hitting the high points here, but I would strongly encourage you to watch the video of this session when it is made available online.

In addition to Dr. Weissglas-Volkov’s slides, I’ve added some additional explanations and examples in various places. You can easily tell that the slides are hers and the graphics that aren’t MyHeritage slides are mine.

Dr. Weissglas-Volkov began the session by introducing the MyHeritage science team and then explaining terminology to set the stage.

A match is when two people match each other on a fairly long piece of DNA. Of course, “fairly long” is defined differently by each vendor.

Your genetic map (of your chromosomes) is comprised of the DNA you inherit from different ancestors by the process of recombination when DNA is transferred from the parents to the child. A centiMorgan measures the relative likelihood that a recombination will occur in a single generation; one centiMorgan corresponds to roughly a 1% chance of a recombination between two points on a chromosome. On average, 36 recombinations occur in each generation, meaning that the DNA is divided at those points across the chromosomes. However, women, for reasons unknown, have about 1.5 times as many recombinations as men.

You can’t see that when looking at an example of a person compared to their parents, of course, because each individual is a full match to each parent, but you can see this visually when comparing a grandchild to their maternal grandmother and their paternal grandmother on a chromosome browser.

The above illustration is the same female grandchild compared to her maternal grandmother, at left, and her paternal grandmother at right. Therefore the number of crossovers at left is through a female child (her mother), and the number at right is through a male child (her father.)

# of Crossovers
Through female child – left 57
Through male child – right 22

There are more segments at left, through the mother, and the segments are generally shorter, because they have been divided into more pieces.

At right, fewer and larger segments through the father.

Keep in mind that because you have a strand of DNA from each parent, with exactly the same “street addresses,” DNA sequencing produces two columns of data – but your Mom’s and Dad’s DNA is intermixed.

The information in the two columns can’t be identified as Mom’s or Dad’s DNA or strand at this point.

That interspersed raw data is called a genotype. A haplotype is when Mom’s and Dad’s DNA can be reassembled into “sides” so you can attribute the two letters at each address to either Mom or Dad.

Here’s a quick example.

The goal, of course, is to figure out how to reassemble your DNA into Mom’s side and Dad’s side so that we know that someone matching you is actually matching on all As (Mom) or all Gs (Dad,) in this example, and not a false match that zigzags back and forth between Mom and Dad.

The best way to accomplish that goal of course is trio phasing, when the child and both parents are available, so by comparing the child’s DNA with the parents you can assign the two strands of the child’s DNA.
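To make trio phasing concrete, here is a minimal sketch of the idea in Python. It is my own illustration, not MyHeritage’s code: the function name `trio_phase` and the tiny three-address genotypes are made up for the example. At each address it keeps the only assignment of the child’s two letters that is consistent with both parents, and marks the address with “?” when the trio alone can’t decide.

```python
def trio_phase(child, mother, father):
    """Assign each of the child's two alleles to Mom or Dad, one address at a time.

    Each argument is a list of unordered allele pairs such as ("A", "G").
    Returns (maternal, paternal) strings; "?" marks addresses the trio cannot resolve.
    """
    maternal, paternal = [], []
    for (a, b), mom, dad in zip(child, mother, father):
        # Keep only the orderings that are consistent with both parents.
        options = {(m, p) for m, p in ((a, b), (b, a)) if m in mom and p in dad}
        if len(options) == 1:
            m, p = options.pop()
        else:
            m, p = "?", "?"  # ambiguous (everyone heterozygous) or inconsistent data
        maternal.append(m)
        paternal.append(p)
    return "".join(maternal), "".join(paternal)


# Hypothetical data: the child's letters arrive intermixed, in no particular order.
child = [("A", "G"), ("G", "A"), ("A", "G")]
mother = [("A", "A"), ("A", "A"), ("A", "C")]
father = [("G", "G"), ("G", "T"), ("G", "G")]
print(trio_phase(child, mother, father))  # ('AAA', 'GGG')
```

Notice that an address where the child and both parents all carry the same two different letters stays unresolved, which is one reason phasing still leans on neighboring addresses.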

Unfortunately, few people have both or even one parent available in order to actually divide their DNA into “sides,” so the next best avenue is statistical phasing. I’ve called this academic phasing in the past, as compared to parental phasing which MyHeritage refers to as trio phasing.

There’s a huge amount of confusion about phasing, with few people understanding there are two distinct types.

Statistical phasing is a type of machine learning where a large number of reference populations are studied. Since we know that DNA travels together in blocks when inherited, statistical phasing learns which DNA travels with which buddy DNA – and creates probabilities. Your DNA is then compared to these models and your DNA is reshuffled in order to assemble your DNA into two groups – one representing your Mom’s DNA and one representing your Dad’s DNA, according to statistical probability.

Looking at your genotype, if we know that As group together at those 6 addresses in my example 95% of the time, then we know that the most likely scenario to create a haplotype is that all of the As came from one parent and all of the Gs from the other parent – although without additional information, there is no way to yet assign the maternal and paternal identifier. At this point, we only know parent 1 and parent 2.
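The reasoning above can be sketched as a toy calculation. This is only an illustration of the principle, under heavy assumptions: real statistical phasing relies on far more sophisticated models, and the function name `statistical_phase` and the reference frequencies below are invented for the example. The sketch tries every way of splitting the genotype into two strands and keeps the split whose two strands are most common in the (hypothetical) reference panel.

```python
from itertools import product


def statistical_phase(genotype, haplotype_freq):
    """Pick the most probable pair of haplotypes consistent with an unphased genotype.

    genotype: list of unordered allele pairs, one per address.
    haplotype_freq: hypothetical reference-panel frequencies keyed by haplotype string.
    """
    best_pair, best_score = None, -1.0
    # Fix the first address so mirror-image splits are not evaluated twice.
    for rest in product([False, True], repeat=len(genotype) - 1):
        flips = (False,) + rest
        parent1 = "".join(b if f else a for (a, b), f in zip(genotype, flips))
        parent2 = "".join(a if f else b for (a, b), f in zip(genotype, flips))
        score = haplotype_freq.get(parent1, 0.0) * haplotype_freq.get(parent2, 0.0)
        if score > best_score:
            best_pair, best_score = (parent1, parent2), score
    return best_pair, best_score


# Six addresses, each reading A and G in no particular order.
genotype = [("A", "G")] * 6
# Made-up frequencies: blocks of all As or all Gs are common, zigzags are rare.
panel = {"AAAAAA": 0.60, "GGGGGG": 0.35, "AGAGAG": 0.03, "GAGAGA": 0.02}
pair, score = statistical_phase(genotype, panel)
print(pair)  # ('AAAAAA', 'GGGGGG')
```

The result is labeled parent 1 and parent 2 only. Nothing in the calculation says which strand is maternal.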

In order to train the computers (machine learning) to properly statistically phase testers’ results, MyHeritage uses known relationships of people to teach the machines. In other words, their reference panels of proven haplotypes grow all of the time as parent/child trios test.

Dr. Weissglas-Volkov then moved on to imputation.

When sequencing DNA, not every location reads accurately, so the missing values can be imputed, or “put back” using imputation.

Initially imputation was a hot mess. Not just for MyHeritage, but for all vendors, imputation having been forced upon them (and therefore us) by Illumina’s change to the GSA chip.

However, machine learning means that imputation models improve constantly, and matching using imputation is greatly improved at MyHeritage today.

Imputation can do more than just fill in blanks left by sequencing read errors.

The benefit of imputation to the genetic genealogy community is that it lets vendors compare results from disparate chips. Vendors that want to allow uploads use imputation to create a global template that incorporates all of the locations from each vendor, then impute the values they don’t actually test for themselves to complete the full template for each person.

In the example below, you can see that no vendor tests all available locations, but when imputation extends the sequences of all testers to the full 1-500 locations, the results can easily be compared to every other tester because every tester now has values in locations 1-500, regardless of which vendor/chip was utilized in their actual testing.
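Here is a small sketch of the template idea, assuming a made-up six-location genome and two made-up chips. The helper names `build_template` and `impute`, and the tiny reference panel, are hypothetical. The fill-in rule (copy from the reference haplotype that agrees best with what was actually tested) is a deliberately naive stand-in for the statistical models vendors really use.

```python
def build_template(*chips):
    """Union of every location tested by any chip, in sorted order."""
    return sorted(set().union(*chips))


def impute(tested, template, reference_panel):
    """Fill untested locations from the reference haplotype that best agrees with the tested ones.

    tested: {location: allele} actually read by the tester's chip.
    reference_panel: list of {location: allele} dicts covering the full template.
    """
    def agreement(ref):
        return sum(ref[loc] == allele for loc, allele in tested.items())

    best = max(reference_panel, key=agreement)
    # Keep every tested value; borrow from the reference only where the chip has a blank.
    return {loc: tested.get(loc, best[loc]) for loc in template}


chip_a = {1, 2, 3, 5}   # locations tested by one hypothetical chip
chip_b = {2, 4, 5, 6}   # locations tested by another
template = build_template(chip_a, chip_b)

panel = [
    {1: "A", 2: "C", 3: "G", 4: "T", 5: "A", 6: "C"},
    {1: "G", 2: "T", 3: "A", 4: "C", 5: "G", 6: "T"},
]
tester_a = {1: "A", 2: "C", 3: "G", 5: "A"}
tester_b = {2: "T", 4: "C", 5: "G", 6: "T"}
full_a = impute(tester_a, template, panel)
full_b = impute(tester_b, template, panel)
print(template)          # [1, 2, 3, 4, 5, 6]
print(sorted(full_a) == sorted(full_b))  # True: both testers now cover the same locations
```

Once both testers have a value at every template location, they can be compared address by address even though their chips overlapped at only two locations.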

Therefore, using imputation, MyHeritage is able to match between quite disparate chips, such as the traditional Illumina chips (OmniExpress), the custom Ancestry chip and the new GSA chip utilized by 23andMe and LivingDNA.

So, how are matches determined?

Matching

First your DNA and that of another person are scanned for nearly identical seed sequences.

A minimum segment length of 6cM must be identified for further match processing to occur. Anything below 6cM is discarded at this point.

The match is then further evaluated to see if the seed match is of a high enough quality that it should be perfected and should count as a match. Other segments continue to be evaluated as well. If the total matching segment(s) is 8 total cM or greater, it’s considered a valid match. MyHeritage has taken the position that they would rather give you a few accidental false matches than to miss good matches. I appreciate that position.

Window cleaning is how they refer to the process of removing pileup regions known to occur in the human genome. This is NOT the same as Ancestry’s routine that removes areas they determine to be “too matchy” for you individually.

The difference is that in humans, for example, there is a segment of chromosome 6 where, for some reason, almost all humans match. Matching across that segment is not informative for genetic genealogy, so that region along with several others similar in nature are removed. At Ancestry, those genome-wide pileup segments are removed, along with other regions where Ancestry decides that you personally have too many matches. The problem is that for me, these “too matchy” segments are many of my Acadian matches. Acadians are endogamous, so lots of them match each other because as a small intermarried population, they share a great deal of the same DNA. However, to me, because I have one great-grandfather that’s Acadian, that “too matchy” information IS valuable although I understand that it wouldn’t be for someone that is 100% Acadian or Jewish.

In situations such as Ashkenazi Jewish matching, which is highly endogamous, MyHeritage uses a higher matching threshold. Otherwise every Ashkenazi person would match every other Ashkenazi person because they all descend from a small founder population, and for genealogy, that’s not useful.

The last step in processing matches is to establish the confidence level that the match is accurately predicted at the correct level – meaning the relationship range based on the amount of matching DNA and other criteria.

For example, does this match cluster with other proven matches of the same known relationship level?

From several confidence ascertainment steps, a confidence score is assigned to the predicted relationship.

Of course, you as a customer see none of this background processing, just the fact that you do match, the size of the match and the confidence score. That’s what genealogists need!

Matching Versus Triangulation Thresholds

Confusion exists about matching thresholds versus triangulation thresholds.

While any single segment must be at least 6 cM in length for the matching process to begin, the actual match threshold at MyHeritage is a total of 8 cM.

I took a look at my lowest match at MyHeritage.

I have two segments, one 6.1 cM segment, and one 6 cM segment that match. It would appear that if I only had one 6 cM segment, it would not show as a match because I didn’t have the minimum 8 cM total.
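The two thresholds can be written out as a tiny check. This is only my reading of the rules as described in the session, not MyHeritage’s actual code, and treating the 6 cM boundary as inclusive is an assumption based on the 6 cM segment shown in my own match.

```python
MIN_SEGMENT_CM = 6.0  # single-segment floor; smaller segments are discarded
MIN_TOTAL_CM = 8.0    # total shared cM needed to count as a match


def is_match(segments_cm):
    """Apply the per-segment floor first, then the total threshold."""
    kept = [cm for cm in segments_cm if cm >= MIN_SEGMENT_CM]
    return sum(kept) >= MIN_TOTAL_CM


print(is_match([6.1, 6.0]))  # True: 12.1 cM total
print(is_match([6.0]))       # False: passes the segment floor but not the 8 cM total
print(is_match([5.9, 5.9]))  # False: both segments are discarded before totaling
```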

Triangulation Threshold

However, after you pass those matching criteria and move on to triangulation with a matching individual, you have the option of selecting the triangulation threshold, which is not the same thing as the match threshold. The match threshold does not change, but you can change the triangulation threshold from 2 cM to 8 cM and selections in-between.

In the example below, I’m comparing myself against two known relatives.

You won’t be shown any matches below the 6 cM individual segment threshold, BUT you can view triangulated segments of different sizes. This is because matching segments often don’t line up exactly and the triangulated overlap between several individuals may be very small, but may still be useful information.

Flying your mouse over the location in the bubble, which is the triangulated segment, tells you the size of the triangulated portion. If you selected the 2 cM triangulation, you would see smaller triangulated portions of matches.
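To see why the triangulated portion can be much smaller than any of the individual matches, here is a rough sketch. It assumes each match is described as a chromosome plus start and end positions already expressed in cM, which is a simplification I’m making for illustration, and the function name `triangulated_overlap` and the sample segments are hypothetical.

```python
def triangulated_overlap(seg_ab, seg_ac, seg_bc, threshold_cm=2.0):
    """Return the (chromosome, start, end) region shared by all three pairwise matches, or None.

    Each segment is (chromosome, start_cm, end_cm) in hypothetical genetic-map coordinates.
    """
    segments = (seg_ab, seg_ac, seg_bc)
    if len({seg[0] for seg in segments}) != 1:
        return None  # the matches are not all on the same chromosome
    start = max(seg[1] for seg in segments)
    end = min(seg[2] for seg in segments)
    if end - start < threshold_cm:
        return None  # overlap is missing or smaller than the selected threshold
    return (seg_ab[0], start, end)


# Three pairwise matches on chromosome 5, each at least 6 cM long.
me_vs_cousin1 = (5, 40.0, 52.5)
me_vs_cousin2 = (5, 48.0, 60.0)
cousin1_vs_cousin2 = (5, 45.0, 55.0)

print(triangulated_overlap(me_vs_cousin1, me_vs_cousin2, cousin1_vs_cousin2, 2.0))  # (5, 48.0, 52.5)
print(triangulated_overlap(me_vs_cousin1, me_vs_cousin2, cousin1_vs_cousin2, 8.0))  # None
```

The shared piece here is only 4.5 cM, so it appears with the 2 cM setting and disappears with the 8 cM setting, even though every individual match is well over 6 cM.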

Closing Session

The conference was closed by Aaron Godfrey, a super-nice MyHeritage employee from the UK. The closing session is worth watching on the recorded livestream when it becomes available, in part because there are feel good moments.

However, the piece of information I was looking for was whether there will be a MyHeritage LIVE conference in 2019, and if so, where.

I asked Gilad afterwards and he said that they will be evaluating the feedback from attendees and others when making that decision.

So, if you attended or joined the livestream sessions and found value, please let MyHeritage know so that they can factor your feedback into their decision. If there are topics you’d like to see as sessions, I’m sure they’d love to hear about that too. Me, I’m always voting for more DNA😊

I hope to hear about MyHeritage LIVE 2019, and I’m voting for any of the following locations:

  • Australia
  • New Zealand
  • Israel
  • Germany
  • Switzerland

What do you think?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Elizabeth Warren’s Native American DNA Results: What They Mean

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important factor in comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

  • Family History and DNA Science – How this works.
  • Elizabeth Warren’s Genealogy
  • Elizabeth Warren’s DNA Results
  • Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will include:

  • Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?
  • Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed exponentially since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customer’s results several times. The reference populations improve, the vendor’s internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that major vendors in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors include MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, who utilizes imputation, is not yet quite up to snuff on their ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native, I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Unexpected Results

DNA testing is notorious for unveiling unexpected results. Adoptions, unknown parents, unexpected ethnicities, previously unknown siblings and half-siblings and more.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

  • There is no Native ancestor
  • The Native DNA has “washed out” over the generations, but they did have a Native ancestor
  • We haven’t yet learned to recognize all of the segments that are Native
  • The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

That means that if you have a Native ancestor 6 generations back in your tree, you share 1.56% of their DNA, on average. I wrote the article, Ancestral DNA Percentages – How Much of Them is in You? to explain how this works.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
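The halving arithmetic is easy to reproduce. In this small sketch the function name `expected_share` and the optional `ancestor_native_fraction` parameter are my own labels. The result is only the average expectation, so any one person can carry more or less than this, including none at all.

```python
def expected_share(generations_back, ancestor_native_fraction=1.0):
    """Average percent of your DNA expected to be Native from one ancestor n generations back.

    ancestor_native_fraction is 1.0 for a fully Native ancestor, lower if already admixed.
    """
    return 100 * 0.5 ** generations_back * ancestor_native_fraction


for n in range(1, 9):
    print(f"{n} generations back: {expected_share(n):.2f}%")

print(expected_share(6))        # 1.5625, the 1.56% figure for 6 generations
print(expected_share(4, 0.25))  # 1.5625, a 25% Native ancestor only 4 generations back
```

The last two lines show how an already admixed ancestor a few generations back can leave you with the same expected amount as a fully Native ancestor much further back.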

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

  • World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.
  • Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.
  • Vendor Algorithm – None of the vendors provide the exact nature of their internal algorithms that they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.
  • Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.
  • Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.
  • Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parent’s DNA test(s).
  • Two Chromosomes – You have two chromosomes, one from your mother and one from your father. DNA testing can’t easily separate those chromosomes, so the exact same “address” on your mother’s and father’s chromosomes that you inherited may carry two different ethnicities. Unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parent’s DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
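For readers who do want a peek at the underpinnings, the difference between the two cases can be sketched in a few lines. This assumes your DNA has already been phased into a maternal and a paternal strand, and the function name `classify_segment` and the six-letter strands are made up for the illustration.

```python
def classify_segment(reference, maternal, paternal):
    """Label a reference segment IBD if one parental strand carries it whole, else IBC or no match.

    All three arguments are equal-length strings of letters, one per address.
    """
    if reference == maternal:
        return "IBD (maternal)"
    if reference == paternal:
        return "IBD (paternal)"
    # Every address can be satisfied, but only by hopping between the two strands.
    if all(r in (m, p) for r, m, p in zip(reference, maternal, paternal)):
        return "IBC (zigzag)"
    return "no match"


native_reference = "AAAAAA"
print(classify_segment(native_reference, "AAAAAA", "GGGGGG"))  # IBD (maternal)
print(classify_segment(native_reference, "AGAGAG", "GAGAGA"))  # IBC (zigzag)
print(classify_segment(native_reference, "AGAGAG", "GGGGGG"))  # no match
```

In the second case an unphased comparison would see an A available at every address and report a match, even though neither parent carries the segment.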

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Elizabeth’s mother, Pauline Herring’s line is shown, at WikiTree, as follows:

Notice that of Elizabeth Warren’s 16 great-great-great grandparents on her mother’s side, 9 are missing.

Paper trail being unfruitful, Elizabeth Warren, like so many, sought to validate her family story through DNA testing.

Elizabeth Warren’s DNA Results

Elizabeth Warren didn’t test with one of the major vendors. Instead, she went directly to a specialist. That’s the equivalent of skipping the family practice doctor and going to the Mayo Clinic.

Elizabeth Warren had test results interpreted by Dr. Carlos Bustamante at Stanford University. You can read the actual report here and I encourage you to do so.

From the report, here are Dr. Bustamante’s credentials:

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

He’s no lightweight in the study of Native American DNA. This 2012 paper, published in PLOS Genetics, Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas focused on teasing out Native American markers in admixed individuals.

From that paper:

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10⁻⁷, corrected p-value of 2.6 x 10⁻⁴) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA analysis is the scientific methodology utilized to group individuals to and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, that she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA which would be all Native. If they were only 25% Native, you will still receive about 6.25% of their DNA, but only one fourth of that 6.25% is possibly Native – so 1.56%. You could also receive NONE of their Native DNA.

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method, looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations and is what allows the identification of complete ancestral segments that are identified as Native or any other population.

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options, indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict most likely when a fully Native ancestor was present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

In the meantime, you might want to read my article, Native American DNA Resources.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

DNA Painter – Touring the Chromosome Garden

This is the third article in a series about DNA Painter. To know DNA Painter is to love DNA Painter! Trust me!

The first two articles are:

The Chromosome Sudoku article introduces you to DNA Painter, its purpose and how to use the tool. The Mining Vendor Data article illustrates exactly how to find the segments you can paint from each of the main autosomal testing vendors and GedMatch.

This article is a leisurely tour through my colorful chromosome garden so that, together, we can see examples of how to utilize the information that chromosome painting unveils.

Chromosome painting can do amazing things: walk you back generations, show visual phasing…and reveal that there’s a mistake someplace, too.

If you’re not willing to be wrong and reconsider, this might not be the field for you😊

Automatic Triangulation

Chromosome painting automatically triangulates your DNA, and in a much easier way than the old spreadsheet method. In fact, triangulation just happens, effortlessly, IF you can determine which side is maternal and which side is paternal. Of course, you’ll always want to check to be sure that your matches also match each other. If not, then that’s an indication that maybe one or both are identical by chance.

In this context, triangulation means:

  • To find a common segment
  • Of reasonable size (generally 7cM or over)
  • That is confirmed to a common ancestor with at least two other individuals
  • Who are not close family

Close family generally means parents, siblings, sometimes grandparents, although parents and grandparents can certainly be used to verify that the match is valid. The best triangulation situation is when you match those two other people through a second child, meaning siblings of your ancestor.
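For readers who like to see the rules spelled out, here is a minimal sketch of the triangulation checklist in Python. It is not how DNA Painter or any vendor implements anything. The `Segment` class, the field names, and the idea of expressing positions in cM are all assumptions made for illustration, and the genealogical step of confirming the common ancestor still has to be done by you.

```python
from dataclasses import dataclass

MIN_CM = 7.0  # "reasonable size" threshold from the list above


@dataclass(frozen=True)
class Segment:
    match: str        # name of the person who matches you
    chromosome: str
    start_cm: float   # assumption: positions expressed in cM for simplicity
    end_cm: float
    side: str         # "maternal" or "paternal"


def overlap_cm(a: Segment, b: Segment) -> float:
    """Length of the shared region, or 0 if different chromosome or side."""
    if a.chromosome != b.chromosome or a.side != b.side:
        return 0.0
    return max(0.0, min(a.end_cm, b.end_cm) - max(a.start_cm, b.start_cm))


def triangulates(a, b, matches_each_other, close_family=frozenset()):
    """True if two matches overlap enough, on the same side, and match each other.

    matches_each_other: set of frozenset({name1, name2}) pairs you have
    verified match each other on this segment (hypothetical input).
    """
    if a.match == b.match:
        return False
    if a.match in close_family or b.match in close_family:
        return False
    if frozenset((a.match, b.match)) not in matches_each_other:
        return False
    return overlap_cm(a, b) >= MIN_CM


# Made-up example data
ann = Segment("Ann", "15", 10.0, 30.0, "paternal")
bob = Segment("Bob", "15", 20.0, 40.0, "paternal")
verified = {frozenset(("Ann", "Bob"))}
print(triangulates(ann, bob, verified))
```

The check that the two matches also match each other is passed in as data because only the vendor’s matching tools can tell you that.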

Different matches, depending on the circumstances, have a different level of value to you as a genealogist. In other words, some are more solid than others.

The X chromosome has special matching and triangulation rules, so we’ll talk about that when we get to that section.

Don’t think of chromosome painting as “doing” triangulation, because triangulation is a bonus of chromosome painting, and it just happens, automatically, so long as you can confirm that the segment is from either your maternal or paternal line.

What does triangulation look like in DNA Painter?

Here’s what my painted chromosome 15 looks like.

Here, I’ve drawn boxes around the areas that are triangulated. Actually, I made a small mistake and omitted one grey bar that’s also part of a second triangulation group. Can you spot it? Hint – look at the grey bars at far right in the overlapping triangulation group boxes where the red arrow is pointing. The box below should extend upwards to incorporate part of that top grey bar too.

Triangulated segments are those several segments piled up on top of each other. It means they match you at the same address on either the maternal or paternal chromosome. That’s good, but it’s not the same as an official “pileup area.”

Ok, so what’s a pileup area?

Pileup Areas

Certain locations in the human genome have been designated as pileup regions based on the fact that many people will match on these segments, not necessarily because they share a common relatively recent ancestor, but instead because a particular segment has a very high frequency in the general human population, or in the population of a specific region. Translated, this means that the segment might not be relevant to genealogy.

But before going too far with this discussion, it doesn’t mean that matches in pileup regions aren’t relevant to genealogy – just consider it a caution sign.

Aside from chromosome 6, which includes the HLA region, I’ve always been rather suspicious of pileup regions, because they don’t seem to hold true for me. You can view a chart that I assembled of the known pileup regions here.

DNA Painter generously includes pileup region warnings, in essence, along a chromosome bar at the top indicating “shared” or “both.”

Please note that you can click to enlarge any image.

Pileup regions are indicated by the grey hashed region at right. In my case, on chromosome 1, the pileup region isn’t piled up at all, on either the paternal (blue) chromosome or the maternal (pink) chromosome.

As you can see, I have exactly one match on the maternal side (green) and one (gold) on the paternal side (with a smidgen of a second grey match) as well, with both extending significantly beyond the pileup region. There is no reason to suspect that these gold and green matches aren’t valid.

If I saw many more matches in a pileup region than elsewhere, or many small matches, or DNA that was supposed to be from multiple ancestors not in the same line, then I’d have to question whether a pileup region was responsible.
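If you keep your segments in a spreadsheet, the caution-sign check can be roughed out like this. It is a sketch only: the coordinates below are made up, the function names are hypothetical, and the actual pileup region boundaries should come from a published list, not from this example.

```python
def overlaps(seg_start, seg_end, region_start, region_end):
    """True if the segment and the region share any stretch."""
    return seg_start < region_end and region_start < seg_end


def pileup_report(segments, region):
    """Count segments touching a pileup region and those lying wholly inside it.

    segments: list of (start, end) positions on one chromosome
    region:   (start, end) of the suspected pileup region
    """
    inside = [s for s in segments if overlaps(s[0], s[1], *region)]
    contained = [s for s in inside if region[0] <= s[0] and s[1] <= region[1]]
    return {"overlapping": len(inside), "fully_inside": len(contained)}


# Made-up positions in Mb, for illustration only
example_segments = [(100, 180), (140, 150), (10, 20)]
example_region = (135, 155)
print(pileup_report(example_segments, example_region))
```

A segment that extends well beyond the region, like the first one here, is less worrying than a small one sitting entirely inside it.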

Stacked Segments

DNA Painter provides you with the opportunity to see which of your ancestors’ segments stack. Stacking is a very important concept of DNA painting.

Before we talk about stacking, notice that the legend for which segments are color coded to specific ancestors is located at right. You can also click on the little grey box beside “Shared or Both,” at left, to show the match names beside the segments.  This is very useful when trying to analyze the accuracy of the match.

I wish DNA Painter offered an option to paint the ancestor’s names beside the segments. Maybe in V2. It’s really difficult to complain about anything because this tool is both free and awesome.

I’m using PowerPoint to label this group of stacked matches for this example.

This is a situation where I know my pedigree chart really well, so I know immediately upon looking at this stacked segment group who this piece of DNA descends from.

Here’s my pedigree chart that corresponds to the stacked segment.

We attribute each DNA segment to a couple initially based on who we match. In this case, that’s William George Estes and Ollie Bolton, my grandparents. The DNA remains attributed to them until we have evidence of which individual person in the couple received that DNA from their ancestors and passed it on to their descendant.

Therefore, the pink people are the half of the couple who we now know (thanks to DNA Painter) did NOT contribute that DNA segment, because we can track the DNA directly through the yellow line until we’re once again to another genetic brick wall couple.

My father is listed at left, and the DNA path runs back to William Crumley the second and his unknown wife who is haplogroup H2a1, the yellow couple at far right. How cool is this? One of those ancestors (or a combined segment from both) has been passed intact to me today. This is not a trivial segment either at 23.3 cM. I would not expect a segment passed to 5th cousins to be that large, but it is!

Also, note that the grey segment of DNA from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) is sitting slightly to the left of the dark blue segment from William Crumley III, so part or all of the grey or blue segment may originate with a different ancestor. Perhaps we’ll know more when additional people test and match on this same segment.

Double Related

I have one person who is related to me through two different lines. I need a way to determine which line (or both) our common DNA segment descends from.

I painted the segment for both of our common ancestor couples. The pink is George Dodson (1702-1770) & Margaret Dagord. The bright blue segment is William Crumley III (1788-1859) & Lydia Brown.

Those two lines don’t converge, at least not that we know of.

Now, as I map additional people, I’ll watch this segment for a tie breaker match between the two ancestors. The gold is not a tie breaker because that’s my grandparents who are downstream of both the pink and blue ancestors.

Painted Ethnicity

23andMe does us the favor of painting our ethnicity segments and allowing us to download a file with those segments. Conversely, DNA Painter does us the favor of allowing us to paint that entire file at once.

I already know my two Native segments on chromosome 1 and 2 descend through my mother, because her DNA is Native in exactly the same location. In other words, in this case, my ethnicity segment does in fact phase to my mother, although that’s not always the case with ethnicity.

Multiple Acadian ancestors are also proven to be Native by both genealogical records and maternal and/or paternal haplogroups.

Therefore, I’ve painted my Native segments on my mother’s side in order to determine exactly from which ancestor(s) those Native segments descend.

Confirming Questionable Ancestors

One very long-standing mystery that seemed almost unsolvable was the identity of the parents of Elijah Vannoy (1784->1850). We know he was the son of one of 4 Vannoy brothers living in Wilkes County, NC. Two were eliminated by existing Bibles and other records, but the other two remained candidates in spite of sifting through every available record and resource. We were out of luck unless DNA came to the rescue. Y DNA confirmed that Elijah was descended from one of the Vannoy males, but didn’t shed light on which one.

I decided that the wives would be the key, since we knew the identity of all four wives, thankfully. Of course, that means we’d be using autosomal DNA to attempt to gather more information.

I entered one candidate couple at Ancestry as Elijah’s parents – the one I felt most likely based on tax records and other criteria – Daniel Vannoy and Sarah Hickerson.  I also entered Sarah’s parents, Charles Hickerson (c 1725-<1793) and Mary Lytle.

I began getting matches to people who descend from Charles Hickerson and Mary Lytle through children other than Sarah.

The grey segment is from a descendant of Lazarus Estes & Elizabeth Vannoy. The salmon segments are from descendants of Charles Hickerson and Mary Lytle.

These segments aren’t small, 12.8 and 16.1 cM, so I’m fairly confident that these multiple segments in combination with the Elizabeth Vannoy segment do indeed descend from Charles Hickerson and Mary Lytle.

At Ancestry, I have 5 matches to Charles Hickerson and Mary Lytle through three of their children. However, only two of the individuals have transferred their results to either Family Tree DNA, MyHeritage or GedMatch where segment information is available to customers.

Finally, the thirty-year-old mystery is solved!

Shifting, Sliding, Offset or Staggered Segment Groups

Occasionally, you can prove an entire large segment by groups of shifting or sliding segments, sometimes referred to as offset or staggered segments.

The entire bright pink region is inherited from Jacob Lentz (1783-1870) and Fredericka Reuhl (1788-1863.) However, it’s not proven by one individual but by a combination of 6 people whose segments don’t all overlap with each other.  The top two do match very closely with me and each other, then the third spans the two groups. The bottom 3 and part of the middle segment match very closely as well.

I can conclude that the entire dark pink region from left to right descends from Jacob and Fredericka.
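The logic of staggered segments is simply merging overlapping ranges. Here is a small sketch, assuming every segment in the list has already been attributed to the same ancestral couple on the same parental chromosome; the positions are made up and the function name is hypothetical.

```python
def merge_staggered(segments):
    """Merge overlapping (start, end) ranges into continuous covered regions."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps the region built so far, so extend it if needed
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap before this segment, so start a new region
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Six made-up segments: two on the left, one spanning the middle, three on the right
six_matches = [(10, 40), (12, 42), (35, 70), (60, 90), (61, 91), (62, 92)]
print(merge_staggered(six_matches))
```

If the spanning middle segment were missing, the result would be two separate regions, and the stretch between them would remain unproven.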

Two Matches – 7 Generations

Two matches is all it took to identify this segment back to George Dodson and Margaret Dagord.

The mustard match is to my grandparents (22cM), and the pink match is to George Dodson (1702-1770) and his wife (22cM) – 7 generations. These people also match each other.

Additional matches would make this evidence stronger, although a 22cM triangulated match is very significant alone. Future matches might also suggest ancestors further back in time.

First Chromosome Fully Mapped

I actually have chromosome 5 entirely mapped to confirmed ancestors. I’m so excited.

Uh Oh – Something’s Wrong

I found a stack that clearly indicates something is wrong.  The question is, what?

The mustard represents my paternal grandparents, so these segments could have come through either of them, although on the pedigree chart below, we can see that this came through my grandfather’s line.

There is only a small overlap with the magenta (Nicholas Speak 1782-1852 and Sarah Faires 1786-1865) and green (James Crumley 1711-1764 and Catherine c1712-c1790,) which could be by chance given that the Nicholas segment is 7.5 cM, so I’m leaving the magenta out of the analysis.

However, the rest of these segments overlap each other significantly, even though they are stepped or staggered.

As you can see from the colors on the pedigree chart, it’s impossible for the green segment to descend from the same ancestor as the purple segment. The purple and orange confirm that branch of the tree, but the red cannot be from the same ancestor or the same line as the green ancestor.

I suspect that the purple and orange line is correct, because there are 4 segments from different people with the same ancestral line.

This means that we have one of the following situations with the red and green segments:

  • The smaller segments are incorrect, false positives, meaning matching by chance. The green segment is 14 cM, so quite large to match by chance. The red segment is 10 cM. Possible, but not probable.
  • The segments are population-based matches, so appear in all 3 lines. Possible, technically, but also not probable due to the segment size.
  • The segments are genuine matches, and one of the lines is also found in one of the other lines, upstream. This is possible, but this would have to be the case with both the red and green lines. To continue to weigh this possibility, I’ll be watching for similar situations with these same ancestors.
  • Some combination of the above.
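Spotting this kind of conflict can be described as a simple rule: overlapping segments on the same parental chromosome should all sit on one line of descent. The sketch below assumes you record, for each painted segment, the path of ancestral couples from you back to the couple it is attributed to. The labels and positions are placeholders, not my actual pedigree, and the function names are hypothetical.

```python
def same_line(path_a, path_b):
    """True if one couple is upstream or downstream of the other.

    Each path lists ancestral couples from most recent to most distant.
    Two attributions agree when the shorter path is the start of the longer.
    """
    shorter, longer = sorted((path_a, path_b), key=len)
    return longer[:len(shorter)] == shorter


def find_conflicts(painted):
    """painted: list of (label, start, end, path) on ONE parental chromosome."""
    conflicts = []
    for i, (label_a, start_a, end_a, path_a) in enumerate(painted):
        for label_b, start_b, end_b, path_b in painted[i + 1:]:
            overlap = start_a < end_b and start_b < end_a
            if overlap and not same_line(path_a, path_b):
                conflicts.append((label_a, label_b))
    return conflicts


# Placeholder lines A and B both descend through the same grandparents
painted_stack = [
    ("mustard", 0, 50, ("grandparents",)),
    ("purple", 5, 30, ("grandparents", "A1", "A2")),
    ("orange", 10, 35, ("grandparents", "A1")),
    ("green", 20, 45, ("grandparents", "B1")),
]
print(find_conflicts(painted_stack))
```

A flagged pair does not say which attribution is wrong. It only tells you where to look, and the possibilities in the list above still have to be weighed by hand.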

I need more matches on this segment for further clarity.

Visual Phasing – Crossovers

A crossover point is where the DNA on one side of a demarcation line is descended from one ancestor and the DNA on the other side is descended from another ancestor, represented by the pink and blue halves of the segment, below.

Crossovers occur when the DNA is combined from two different ancestors when it is passed to the child. In other words, a chunk of mom’s ancestors’ DNA is contributed by mom and a chunk of dad’s ancestors’ DNA is contributed as well. The seam between different ancestors’ DNA pieces is called a crossover.

In this example, the brown lines, confirmed by several testers to be from Henry Bolton (c1759-1846) and Nancy Mann (c1780-1841), are shown with a very specific left starting point, all in a vertical line. It looks for all the world like this is a crossover point. The DNA to the left would have been contributed by another, as yet unidentified, ancestor.

The gold lines above are matches from more recent generations.

Naming Those Unnamed Acadians

My Acadian ancestry is hopelessly intertwined, but chromosome painting may in fact provide me with some prayer of unraveling this ball of twine. Eventually.

When I know that someone is Acadian, but I can’t tell which of many lines I connect through, I add them as “Acadian Undetermined.”

There’s a lot of Acadian DNA, because it’s an endogamous population and they just keep passing the same segments around and around in a very limited population.

On my maternal chromosome, all of the olive green is “Acadian Undetermined.”  However, that blue segment in the stack is Rene de Forest (1670-1751) and Francoise Dugas (1678->1751).

In essence, this one match identified all of the DNA of the other people who are now simply a row in the Acadian Undetermined stack. Now I need to go back and peruse the trees of these individuals to determine if they descend from this line, or a common ancestor of this line, or if (some of) these matches are a matter of endogamy.

Endogamous matches can be population based, meaning that you do match each other, but it’s because you share so much of the same DNA because you have small pieces of many common ancestors – not because a particular segment comes from one specific ancestor. You can also share part of your DNA from Mom’s side and part from Dad’s side, because both of your parents descend from a common population and not because the entire segment comes from any particular ancestor.

On some long cold winter weekend, I’ll go through and map all of the trees of my Acadian matches to see what I can unravel. I just love matches with trees. You just can’t do something like this otherwise.

Of course, those Acadians (and other endogamous populations) can be tricky, no matter what, one click up from a needle in a haystack.

Acadian Endogamy Haystack on Steroids

At first, our haystack looks like we’ve solved the mystery of the identity of the stack.  However, we soon discover that maybe things aren’t as neat and tidy as we think.

Of course, the olive green is Acadian Undetermined, but the three other colored segments are:

  • Pink – Guillaume Blanchard (1650-1715/17) & Huguette Goujon (c1647-1717)
  • Brown/Pink – Francois Broussard (c1653-1716) & Catherine Richard (c1663-1748)
  • Coffee – Daniel Garceau (1707-1772) & Anne Doucet (1713-1791)

Looking at the pedigree chart, we find two of these couples in the same lineage, so all is good, until we find the third, pink, couple, at the bottom.

Clearly, this segment can’t be in two different lines at once, so we have a problem.  Or do we?

Working the pink troublesome lines on back, we make a discovery.

We find a Blanchard line consisting of Guilluame Blanchard born circa 1590 and Huguette Poirier also born circa 1590.

Interesting. Let’s compare the Guillaume Blanchard and Huguette Goujon line. Is this the same couple, but with a different surname for her?

No, as it turns out, the Guillaume Blanchard who married Huguette Goujon was the grandson of Guilluame Blanchard and Huguette Poirier. That haystack segment of DNA was passed down through two different lines, it appears, to converge in three descendants – me, the descendant of the pink segment couple and the descendant of the brown/burgundy segment couple. This segment reaches back in time to the birth of either Guilluame Blanchard or Huguette Poirier in 1590, someplace in France, rode over on the ship to Port Royal in the very early 1600s, probably before Jamestown was settled, and has been kicking around in my ancestors and their descendants ever since.

This 18 or so cM ancestral segment is buried someplace at Port Royal, Nova Scotia, but lives on in me and several other people through at least two divergent lines.

The X Chromosome

Several vendors don’t report the X chromosome segments. I do use X segments from those who do, but I utilize a different threshold because the SNP density is about half of that on the other chromosomes. In essence, you need a match twice as large to be equivalent to a match on another chromosome.

Generally, I don’t rely on X segments below 10 cM for anyone, and I generally only use segments over 14 cM with no fewer than 500 SNPs.

Having just said that, I have painted a few smaller segments, because I know that if they are inaccurate, they are very easy to delete. They can remain in speculative mode, which is the default for DNA Painter, and that’s what I use.

The great thing about the X chromosome is that because of its special inheritance path, you can sometimes push these segments another 2 generations back in time.

Let’s use an X chromosome match in conjunction with my X fan chart printed through Charting Companion.

On the paternal X, I inherited the gold segment from the couple, William George Estes (1873-1971) & Ollie Bolton (1874-1955.) However, since my father didn’t inherit an X from William George Estes (because my father inherited the Y from his father,) that X segment has to be from Ollie Bolton, and therefore from her parents Joseph Bolton (1853-1920) and Margaret Claxton (1851-1920.)

The segment from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) that’s 14 cM is false. It can’t descend from that couple. Same for the 7.5 cM from Jotham Brown (c1740-c1799) & Phoebe unk (c1747-c1803.) That segment’s false too. The green 48 cM segment from Samuel Claxton (1827-1876) and Elizabeth Speak (1832-1907)?  That segment’s good to go!

On my mother’s side, there’s a 7.8 cM Acadian Undetermined, which must be false, because Curtis Benjamin Lore (1856-1909) did not inherit an X chromosome from his Acadian father, Antoine Lore (1805-1862/67.)  Therefore, my X chromosome has no Acadian at all. I never realized that before, and it makes my X chromosome MUCH easier.

How about that light green 33cM segment from Antoine Lore (1805-1862/67) & Rachel Hill (1814/15-1870/80)? That segment must come from Rachel Hill, so it’s pushed back another generation to Joseph Hill (1790-1871) and Nabby Hall (1792-1874.)

I love the X chromosome because when you find a male in the line, you automatically get bumped two more generations back to his mother’s parents. It’s like the X prize for genetic genealogy, pardon the pun!
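The X rule used throughout this section boils down to one test: a male never receives an X from his father. Here is a minimal sketch of that test, assuming you describe the path from the tester back to an ancestor as a string of sexes. The function name and the string encoding are my own invention for illustration.

```python
def can_carry_x(path_sexes: str) -> bool:
    """Can X DNA travel from the last person in the path down to the first?

    path_sexes lists sexes from the tester back to the ancestor, one letter
    per person, for example "FMF" = female tester, her father, his mother.
    The path is blocked wherever a male would inherit from his father.
    """
    for child, parent in zip(path_sexes, path_sexes[1:]):
        if child == "M" and parent == "M":
            return False
    return True


# Tester (assumed female) -> father -> paternal grandfather: blocked
print(can_carry_x("FMM"))

# Tester -> father -> paternal grandmother: possible
print(can_carry_x("FMF"))

# Tester -> father -> paternal grandmother -> her father: still possible
print(can_carry_x("FMFM"))
```

Any painted X segment attributed to an ancestor whose path fails this test, like the 14 cM segment above, has to be a false match or misattributed.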

Adoptees

Some adoptees are lucky and receive close matches immediately. Others, not so much and the search is a long process.

If you’re an adoptee trying to figure out how your matches connect together, use in-common-match groupings to cluster matches together, then paint them in groups.  Utilize the overlapping segments in order to view their trees, looking for common surnames. Always start with the groups with the longest segments and the most matches. The larger the match, the more likely you are to be able to find a connection in a more recent generation. The more matches, the more likely you are to be able to spot a common surname (or two.)

Painting can speed this process significantly.

Much More Than Painting

I hope this tour through my colorful chromosomes has illustrated how much fun analysis can be. You’ll have so much fun that you won’t even realize you’re triangulating, phasing and all of those other difficult words.

If you have something you absolutely have to do, set an alarm – or you’ll forget all about it. Voice of experience here!

So, go and find some segments to paint so all of these exciting things can happen to you too!

How far back will you be able to identify a segment to a specific ancestor?  How about a triangulated segment? An X segment?

Have fun!!! Don’t forget to eat!


Who Tests the X Chromosome?

Recently, someone asked which of the major DNA testing companies test the X chromosome and which ones use the X in matching. How does this difference influence the quality of our matches?

Vendor | X in Download File | Uses X in Matching | X Included in Total cM Count
23andMe | Yes | Yes | Yes
Family Tree DNA | Yes | Yes (if have a match on another chromosome) | No
Ancestry | Yes | *No | No
MyHeritage | Yes | No | No
GedMatch | N/A | Separately | No

*If Ancestry did utilize the X in matching, it wouldn’t benefit customers because Ancestry does not show segment information by chromosome.  In other words, no chromosome browser.

Family Tree DNA includes any size X match IF and only if the two people already match on a different chromosome.

GedMatch, of course, isn’t a vendor who does DNA testing, so they don’t provide download files.  They are solely on the receiving end.

X CentiMorgan Counts

Due to variations in the way vendors calculate matches and total cM counts, your mileage may vary a bit.

In other words, the 23andMe cM total, if an X match is involved, may be slightly more than a match between the same two people at Family Tree DNA, where the X match cM is not included in the cM total.

Conversely, you won’t show an X match with someone at Family Tree DNA if there isn’t also another segment on a different chromosome that matches.

In general, due to the thin spread of SNPs on the X chromosome, you will need, on average, a cM match that is twice as large as on other chromosomes to be considered of equal weight.

In other words, a 10 cM match on the X chromosome would only be genealogically equivalent to approximately a 5 cM match on any other chromosome.
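As a quick worked version of that rule of thumb, here is the halving expressed in Python. The 0.5 factor is the rough SNP density ratio described above, not a vendor constant, and the function name is hypothetical.

```python
def autosomal_equivalent_cm(x_cm: float, density_ratio: float = 0.5) -> float:
    """Rough genealogical weight of an X match in autosomal terms.

    density_ratio is the rule-of-thumb assumption that the X has about
    half the SNP density of chromosomes 1-22.
    """
    return x_cm * density_ratio


# A 10 cM X match carries roughly the weight of a 5 cM autosomal match
ten_cm_equivalent = autosomal_equivalent_cm(10.0)
print(ten_cm_equivalent)
```

Treat the result as a guide for deciding which X matches are worth your time, not as a measurement.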

X matches really can’t be evaluated by the same rules as other chromosomes due both to their SNP paucity and their inheritance path, which is why most vendors don’t include those segments in the total cM count.

X Matches

While including the X chromosome cM count is problematic, X matching can be a huge benefit because of the unique inheritance path of the X chromosome.

In the article, X Marks the Spot, we discussed the inheritance path of the X chromosome for both males and females. Females inherit an X chromosome from both father and mother, which recombines just like chromosomes 1-22.  However, men only inherit an X from their mother, because they inherit a Y from their father instead of the X.  Therefore, males will only inherit an X from their mother, and the X that females inherit from their father is the one he received from his mother.

Charting Companion software works with your genealogy software of choice to produce a lovely fan chart where the contributors of my X chromosome are charted in color, above. You can read more about Charting Companion here.

The great news is that if you and a match share a significant portion of the X chromosome, meaning more than 15 cM which reduces the likelihood of an identical by chance match, the common ancestor (on that segment) has to come from an ancestor in your direct X path.

I’m always excited to see with whom I share an X.  That piece of information alone helps me focus my ancestor detective efforts on a specific portion of my tree.

Some X segments can remain intact for generations and may be very old.  So don’t be surprised if the common ancestor of the X segment is not the same as the common ancestor of another matching segment.

Sorting by X

I wasn’t able to find a way to sort by X chromosome matches at 23andMe, but you can sort by the X at both Family Tree DNA and GedMatch.

At GedMatch, X matching shows on the one-to-many match page.  You can sort by either Total X cM or Largest X cM by using the up and down arrows, at right, below, in the X DNA columns.

After you identify an X match, be sure to run the X one-to-one match option to verify.

My GedMatch matches cause me to wonder if 23andMe is using a different reporting threshold for the X chromosome, because one of my matches at GedMatch is a close family member with no X match at 23andMe, but a total of 32 X cM and with a longest segment of 14 X cM at GedMatch.

That same individual matches me with the largest X segment of 14 cM at Family Tree DNA as well.

Family Tree DNA X Match Phasing

At Family Tree DNA, on your Family Finder matches page, just click on the X-Match header (at right, below) to bring all of your X matches to the top of your list.

If you have linked any kits of relatives to your tree, you will see numbers of phased kits on the maternal and paternal tabs with the red and blue male and female icons. In the example above, I have 3313 matches total, with 744 being paternal, 586 being maternal.

Next, click on the maternal or paternal tab to see only the people with X matches who match you on your maternal or paternal lines. Matches are automatically sorted into maternal and paternal “buckets” for you. Remember to check the size of the X match before deciding about relevance.

Who is your largest X match that you don’t already know?  Maybe you can find your common ancestor today.

Have fun!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Which DNA Test is Best?

If you’re reading this article, congratulations. You’re a savvy shopper and you’re doing some research before purchasing a DNA test. You’ve come to the right place.

The most common question I receive is asking which test is best to purchase. There is no one single best answer for everyone – it depends on your testing goals and your pocketbook.

Testing Goals

People who want to have their DNA tested have a goal in mind and seek results to utilize for their particular purpose. Today, in the Direct to Consumer (DTC) DNA market space, people have varied interests that fall into the general categories of genealogy and medical/health.

I’ve approached the question of “which test is best” by providing information grouped into testing goal categories. I’ve compared the different vendors and tests from the perspective of someone who is looking to test for those purposes, and I’ve created separate sections of this article for each interest.

We will be discussing testing for:

  • Ethnicity – Who Am I? – Breakdown by Various World Regions
  • Adoption – Finding Missing Parents or Close Family
  • Genealogy – Cousin Matching and Ancestor Search/Verification
  • Medical/Health

We will be reviewing the following test types:

  • Autosomal
  • Y DNA (males only)
  • Mitochondrial DNA

I have included summary charts for each section, plus an additional chart for:

  • Additional Vendor Considerations

If you are looking to select one test, or have limited funds, or are looking to prioritize certain types of tests, you’ll want to read about each vendor, each type of test, and each testing goal category.

Each category reports information about the vendors and their products from a different perspective – and only you can decide which of these perspectives and features are most important to you.

You might want to read this short article for a quick overview of the 4 kinds of DNA used for genetic genealogy and DTC testing and how they differ.

The Big 3

Today, there are three major players in the DNA testing market, listed here in no particular order:

  • Family Tree DNA
  • Ancestry
  • 23andMe

Each of these companies offers autosomal tests, but each vendor offers features that are unique. Family Tree DNA and 23andMe offer additional tests as well.

In addition to the Big 3, there are a couple of new kids on the block that I will mention where appropriate. There are also niche players for the more advanced genetic genealogist or serious researcher, and this article does not address advanced research.

In a nutshell, if you are a serious genealogist, you will want to take all of the following tests to maximize your tools for solving genealogical puzzles. There is no one single test that does everything.

  • Full mitochondrial sequence that informs you about your matrilineal line (only) at Family Tree DNA. This test currently costs $199.
  • Y DNA test (for males only) that informs you about your direct paternal (surname) line (only) at Family Tree DNA. This test begins at $169 for 37 markers.
  • Family Finder, an autosomal test that provides ethnicity estimates and cousin matching at Family Tree DNA. This test currently costs $89.
  • AncestryDNA, an autosomal test at Ancestry.com that provides ethnicity estimates and cousin matching. (Do not confuse this test with Ancestry by DNA, which is not the same test and does not provide the same features.) This test currently costs $99, plus the additional cost of a subscription for full feature access. You can test without a subscription, but nonsubscribers can’t access all of the test result features provided to Ancestry subscribers.
  • 23andMe Ancestry Service test, an autosomal test that provides ethnicity estimates and cousin matching. The genealogy version of this test costs $99, the medical+genealogy version costs $199.

A Word About Third Party Tools

A number of third party tools exist, such as GedMatch and DNAGedcom.com, and while these tools are quite useful after testing, these vendors don’t provide tests. In order to use these sites, you must first take an autosomal DNA test from a testing vendor. This article focuses on selecting your DNA testing vendor based on your testing goals.

Let’s get started!

Ethnicity

Many people are drawn to DNA testing through commercials that promise to “tell you who you are.” While the allure is exciting, the reality is somewhat different.

Each of the three major vendors provides an ethnicity estimate based on your autosomal DNA test, and each of the three vendors will provide you with a different result.

Yep, same person, different ethnicity breakdowns.

Hopefully, the outcomes will be very similar, but that’s certainly not always the case. However, many people take one test and believe those results wholeheartedly. Please don’t. You may want to read Concepts – Calculating Ethnicity Percentages to see how varied my own ethnicity reports are at various vendors as compared to my known genealogy.

The technology for understanding “ethnicity” from a genetic perspective is still very new. Your ethnicity estimate is based on reference populations from around the world – today. People and populations move, and have moved, for hundreds, thousands and tens of thousands of years. Written history only reaches back a fraction of that time, so the estimates provided to people today are not exact.

That isn’t to criticize any individual vendor. View each vendor’s results not as gospel, but as their opinion based on their reference populations and their internal proprietary algorithm of utilizing those reference populations to produce your ethnicity results.

To read more about how ethnicity testing works, and why your results may vary between vendors or not be what you expected, click here.

I don’t want to discourage anyone from testing, only to be sure consumers understand the context of what they will be receiving. Generally speaking, these results are accurate at the continental level, and less accurate within continents, such as European regional breakdowns.

All three testing companies provide additional features or tools, in addition to your ethnicity estimates, that are relevant to ethnicity or population groups.

Let’s look at each company separately.

Ethnicity – Family Tree DNA

Family Tree DNA’s ethnicity tool is called myOrigins and provides three features or tools in addition to the actual ethnicity estimate and associated ethnicity map.

Please note that throughout this article you can click on any image to enlarge.

On the myOrigins ethnicity map page, above, your ethnicity percentages and map are shown, along with two additional features.

The Shared Origins box to the left shows the matching ethnic components of people on your DNA match list. This is particularly useful if you are trying to discover, for example, where a particular minority admixture comes from in your lineage. You can select different match types, for example, immediate relatives or X chromosome matches, which have special inheritance qualities.

Clicking on the apricot (mitochondrial DNA) and green (Y DNA) pins in the lower right corner drops the pins in the locations on your map of the most distant ancestral Y and mitochondrial DNA locations of the individuals in the group you have selected in the Shared Origins match box. You may or may not match these individuals on the Y or mtDNA lines, but families tend to migrate in groups, so match hints of any kind are important.

A third unique feature provided by Family Tree DNA is Ancient Origins, a tool released with little fanfare in November 2016.

Ancient Origins shows the ancient source of your European DNA, based on genome sequencing of ancient DNA from the locations shown on the map.

Additionally, Family Tree DNA hosts an Ancient DNA project where they have facilitated the upload of the ancient genomes so that customers today can determine if they match these ancient individuals.

Kits included in the Ancient DNA project are shown in the chart below, along with their age and burial location. Some have matches today, and some of these samples are included on the Ancient Origins map.

| Individual | Approx. Age (years) | Burial Location | Matches | Ancient Origins Map |
| --- | --- | --- | --- | --- |
| Clovis Anzick | 12,500 | Montana (US) | Yes | No |
| Linearbandkeramik | 7,500 | Stuttgart, Germany | Yes | Yes |
| Loschbour | 8,000 | Luxembourg | Yes | Yes |
| Palaeo-Eskimo | 4,000 | Greenland | No | No |
| Altai Neanderthal | 50,000 | Altai | No | No |
| Denisova | 30,000 | Siberia | No | No |
| Hinxton-4 | 2,000 | Cambridgeshire, UK | No | No |
| BR2 | 3,200 | Hungary | Yes | Yes |
| Ust’-Ishim | 45,000 | Siberia | Yes | No |
| NE1 | 7,500 | Hungary | Yes | Yes |

Ethnicity – Ancestry

In addition to your ethnicity estimate, Ancestry also provides a feature called Genetic Communities.

Your ethnicity estimate provides percentages of DNA found in regions shown on the map by fully colored shapes – green in Europe in the example above. Genetic Communities show how your DNA clusters with other people in specific regions of the world – shown with dotted clusters in the US in this example.

In my case, my ethnicity at Ancestry shows my European roots, illustrated by the green highlighted areas, and my two Genetic Communities are shown by yellow and red dotted regions in the United States.

My assigned Genetic Communities indicate that my DNA clusters with other people whose ancestors lived in two regions: The Lower Midwest and Virginia, as well as the Alleghenies and Northeast Indiana.

Testers can then view their DNA matches within that community, as well as a group of surnames common within that community.

The Genetic Communities provided for me are accurate, but don’t expect all of your genealogical regions to be represented in Genetic Communities. For example, my DNA is 25% German, and I don’t have any German communities today, although Ancestry will be adding new Genetic Communities as new clusters are formed.

You can read more about Genetic Communities here and here.

Ethnicity – 23andMe

In addition to ethnicity percentage estimates, called Ancestry Composition, 23andMe offers the ability to compare your Ancestry Composition against that of your parent to see which portions of your ethnicity you inherited from each parent, although there are problems with this tool incorrectly assigning parental segments.

Additionally, 23andMe paints your chromosome segments with your ethnic heritage, as shown below.

You can see that my yellow Native American segments appear on chromosomes 1 and 2.

In January 2017, 23andMe introduced their Ancestry Timeline, which I find to be extremely misleading and inaccurate. On my timeline, shown below, they estimate that my most recent British and Irish ancestor was found in my tree between 1900 and 1930 while in reality my most recent British/Irish individual found in my tree was born in England in 1759.

I do not view 23andMe’s Ancestry Timeline as a benefit to the genealogist, having found that it causes people to draw very misleading conclusions, even to the point of questioning their parentage based on the results. I wrote about their Ancestry Timeline here.

Ethnicity Summary

All three vendors provide both ethnicity percentage estimates and maps. All three vendors provide additional tools and features relevant to ethnicity. Vendors also provide matching to other people which may or may not be of interest to people who test only for ethnicity. “Who you are” only begins with ethnicity estimates.

DNA test costs are similar, although the Family Tree DNA test is less at $89. All three vendors have sales from time to time.

Ethnicity Vendor Summary Chart

Ethnicity testing is an autosomal DNA test and is available for both males and females.

| Feature | Family Tree DNA | Ancestry | 23andMe |
| --- | --- | --- | --- |
| Ethnicity Test | Included with $89 Family Finder test | Included with $99 Ancestry DNA test | Included with $99 Ancestry Service |
| Percentages and Maps | Yes | Yes | Yes |
| Shared Ethnicity with Matches | Yes | No | Yes |
| Additional Feature | Y and mtDNA mapping of ethnicity matches | Genetic Communities | Ethnicity phasing against parent (has issues) |
| Additional Feature | Ancient Origins | | Ethnicity mapping by chromosome |
| Additional Feature | Ancient DNA Project | | Ancestry Timeline |

Adoption and Parental Identity

DNA testing is extremely popular among adoptees and others in search of missing parents and grandparents.

The techniques used for adoption and parental search are somewhat different than those used for more traditional genealogy, although non-adoptees may wish to continue to read this section because many of the features that are important to adoptees are important to other testers as well.

Adoptees often utilize autosomal DNA somewhat differently than traditional genealogists by using a technique called mirror trees. In essence, the adoptee utilizes the trees posted online of their closest DNA matches to search for common family lines within those trees. The common family lines will eventually lead to the individuals within those common trees that are candidates to be the parents of the searcher.

Here’s a simplified hypothetical example of my tree and a first cousin adoptee match.

The adoptee matches me at a first cousin level, meaning that we share at least one common grandparent – but which one? Looking at other people the adoptee matches, or the adoptee and I both match, we find Edith Lore (or her ancestors) in the tree of multiple matches. Since Edith Lore is my grandmother, the adoptee is predicted to be my first cousin, and Edith Lore’s ancestors appear in the trees of our common matches – that tells us that Edith Lore is also the (probable) grandmother of the adoptee.
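The idea of looking for the same ancestor in the trees of several shared matches can be sketched as a simple count. This is a minimal illustration, not the method used by any vendor or by the volunteers who build mirror trees; apart from Edith Lore, every name below is a hypothetical placeholder.

```python
from collections import Counter


def shared_tree_ancestors(match_trees, min_trees=2):
    """Return ancestors named in at least `min_trees` of the matches' trees, most frequent first."""
    counts = Counter()
    for ancestors in match_trees.values():
        counts.update(set(ancestors))  # count each ancestor once per tree
    return [(name, n) for name, n in counts.most_common() if n >= min_trees]


# Hypothetical trees for three shared matches; only Edith Lore comes from the example above.
match_trees = {
    "Match 1": ["Edith Lore", "John Doe"],
    "Match 2": ["Edith Lore", "Jane Roe"],
    "Match 3": ["Edith Lore", "John Doe", "Mary Major"],
}

print(shared_tree_ancestors(match_trees))  # [('Edith Lore', 3), ('John Doe', 2)]
```

An ancestor who shows up in every shared match's tree is a candidate to investigate, not a proven connection, which is why follow-up testing is still needed.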

Looking at the possibilities for how Edith Lore can fit into both my tree and the adoptee’s tree, as first cousins, we find the following scenario.

Testing the known child of daughter Ferverda will then provide confirmation of this relationship if the known child proves to be a half sibling to the adoptee.

Therefore, close matches, the ability to contact matches and trees are very important to adoptees. I recommend that adoptees make contact with www.dnaadoption.com. The volunteers there specialize in adoptions and adoptees, provide search angels to help people and classes to teach adoptees how to utilize the techniques unique to adoption search such as building mirror trees.

For adoptees, the first rule is to test with all 3 major vendors plus MyHeritage. Family Tree DNA allows you to test with both 23andMe and Ancestry and subsequently transfer your results to Family Tree DNA, but I would strongly suggest adoptees test on the Family Tree DNA platform instead. Your match results from transferring to Family Tree DNA from other companies, except for MyHeritage, will be fewer and less reliable because both 23andMe and Ancestry utilize different chip technology.

For most genealogists, MyHeritage is not a player, as they have only recently entered the testing arena, have a very small data base, no tools and are having matching issues. I recently wrote about MyHeritage here. However, adoptees may want to test with MyHeritage, or upload your results to MyHeritage if you tested with Family Tree DNA, because your important puzzle-solving match just might have tested there and no place else. You can read about transfer kit compatibility and who accepts which vendors’ tests here.

Adoptees can benefit from ethnicity estimates at the continental level, meaning that regional (within continent) or minority ethnicity should be taken with a very large grain of salt. However, knowing that you have 25% Jewish heritage, for example, can be a very big clue to an adoptee’s search.

Another aspect of the adoptee’s search that can be relevant is the number of foreign testers. For many years, neither 23andMe nor Ancestry tested substantially (or at all) outside the US. Family Tree DNA has always tested internationally and has a very strong Jewish data base component.

Not all vendors report X chromosome matches. The X chromosome is important to genetic genealogy, because it has a unique inheritance path. Men don’t inherit an X chromosome from their fathers. Therefore, if you match someone on the X chromosome, you know the relationship, for a male, must be from their mother’s side. For a female, the relationship must be from the mother or the father’s mother’s side. You can read more about X chromosome matching here.

Neither Ancestry nor MyHeritage has a chromosome browser, which would allow you to view the segments of DNA on which you match other individuals, including segments on the X chromosome.

Adoptee Y and Mitochondrial Testing

In addition to autosomal DNA testing, adoptees will want to test their Y DNA (males only) and mitochondrial DNA.

These tests are different from autosomal DNA which tests the DNA you receive from all of your ancestors. Y and mitochondrial DNA focus on only one specific line, respectively. Y DNA is inherited by men from their fathers and the Y chromosome is passed from father to son from time immemorial. Therefore, testing the Y chromosome provides us with the ability to match to current people as well as to use the Y chromosome as a tool to look far back in time. Adoptees tend to be most interested in matching current people, at least initially.

Working with male adoptees, I have found that about 30% of the time a male will match strongly to a particular surname, especially at higher marker levels. That isn’t always true, but adoptees will never know if they don’t test. An adoptee’s match list is shown at 111 markers, below.

Furthermore, utilizing the Y and mitochondrial DNA tests in conjunction with autosomal DNA matching at Family Tree DNA helps narrow possible relatives. The Advanced Matching feature allows you to see who you match on both the Y (or mitochondrial) DNA lines AND the autosomal test, in combination.

Mitochondrial DNA tests the matrilineal line only, as women pass their mitochondrial DNA to all of their children, but only females pass it on. Both genders of children carry their mother’s mitochondrial DNA. Family Tree DNA provides matching and advanced combination matching/searching for mitochondrial DNA as well as Y DNA. Unfortunately, mitochondrial DNA is more difficult to work with because the surname changes in each generation, but you cannot be descended through your direct matrilineal line from a woman, or her direct matrilineal ancestors, if you don’t substantially match her mitochondrial DNA.
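To see how narrow these two lines are compared with the rest of the tree, here is a small Python sketch. It assumes standard Ahnentafel numbering (the tester is 1, the father of person n is 2n and the mother is 2n + 1); the function names are made up for illustration.

```python
def direct_paternal_line(generations: int) -> list[int]:
    """Ahnentafel numbers of the Y DNA line: father, father's father, and so on."""
    return [2 ** g for g in range(1, generations + 1)]


def direct_matrilineal_line(generations: int) -> list[int]:
    """Ahnentafel numbers of the mitochondrial line: mother, mother's mother, and so on."""
    return [2 ** (g + 1) - 1 for g in range(1, generations + 1)]


generations = 4
print(direct_paternal_line(generations))     # [2, 4, 8, 16]
print(direct_matrilineal_line(generations))  # [3, 7, 15, 31]

# Ancestors in each generation versus the single one on each direct line.
for g in range(1, generations + 1):
    print(f"Generation {g}: {2 ** g} ancestors, 1 on the Y line, 1 on the mitochondrial line")
```

Only male testers carry the Y line, but everyone carries the mitochondrial line. Autosomal testing covers the ancestors these two lines leave out.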

Some vendors state that you receive mitochondrial DNA with your autosomal results, which is only partly accurate. At 23andMe, you receive a haplogroup but no detailed results and no matching. 23andMe does not test the entire mitochondria and therefore cannot provide either advanced haplogroup placement or Y or mitochondrial DNA matching between testers.

For additional details on the Y and Mitochondrial DNA tests themselves and what you receive, please see the Genealogy – Y and Mitochondrial DNA section.

Adoption Summary

Adoptees should test with all 4 vendors plus Y and mitochondrial DNA testing.

  • Ancestry – due to their extensive data base size and trees
  • Family Tree DNA – due to their advanced tools, chromosome browser, Y and mitochondrial DNA tests (Ancestry and 23andMe participants can transfer autosomal raw data files and see matches for free, but advanced tools require either an unlock fee or a test on the Family Tree DNA platform)
  • 23andMe – no trees and many people don’t participate in sharing genetic information
  • MyHeritage – new kid on the block, working through what is hoped are startup issues
  • All adoptees should take the full mitochondrial sequence test.
  • Male adoptees should take the 111 marker Y DNA test, although you can start with 37 or 67 markers and upgrade later.
  • Y and mitochondrial tests are only available at Family Tree DNA.

Adoptee Vendor Feature Summary Chart

| Feature | Family Tree DNA | Ancestry | 23andMe | MyHeritage |
| --- | --- | --- | --- | --- |
| Autosomal DNA – Males and Females | | | | |
| Matching | Yes | Yes | Yes | Yes – problems |
| Relationship Estimates* | Yes – May be too close | Yes – May be too distant | Yes – Matches may not be sharing | Yes – problematic |
| International Reach | Very strong | Not strong but growing | Not strong | Small but subscriber base is European focused |
| Trees | Yes | Yes | No | Yes |
| Tree Quantity | 54% have trees, 46% no tree (of my first 100 matches) | 56% have trees, 44% no tree or private (of my first 100 matches) | No trees | ~50% don’t have trees or are private (cannot discern private tree without clicking on every tree) |
| Data Base Size | Large | Largest | Large – but not all opt in to matching | Very small |
| My # of Matches on 4-23-2017 | 2,421 | 23,750 | 1,809 but only 1,114 are sharing | 75 |
| Subscription Required | No | No for partial, Yes for full functionality including access to matches’ trees, minimal subscription for $49 by calling Ancestry | No | No for partial, Yes for full functionality |
| Other Relevant Tools | | New Ancestor Discoveries | | |
| Autosomal DNA Issues | Many testers don’t have trees | Many testers don’t have trees | Matching opt-in is problematic, no trees at all | Matching issues, small data base size is problematic, many testers don’t have trees |
| Contact Methodology | E-mail address provided to matches | Internal message system – known delivery issues | Internal message system | Internal message system |
| X Chromosome Matching | Yes | No | Yes | No |
| Y-DNA – Males Only | | | | |
| Y DNA STR Test | Yes – 37, 67, and 111 markers | No | No | No |
| Y Haplogroup | Yes as part of STR test plus additional testing available | No | Yes, basic level but no additional testing available, outdated haplogroups | No |
| Y Matching | Yes | No | No | No |
| Advanced Matching Between Y and Autosomal | Yes | No | No | No |
| Mitochondrial DNA – Males and Females | | | | |
| Test | Yes, partial and full sequence | No | No | No |
| Mitochondrial DNA Haplogroup | Yes, included in test | No | Yes, basic but full haplogroup not available, haplogroup several versions behind | No |
| Advanced Matching Between Mitochondrial and Autosomal | Yes | No | No | No |

Genealogy – Cousin Matching and Ancestor Search/Verification

People who want to take a DNA test to find cousins, to learn more about their genealogy, to verify their genealogy research or to search for unknown ancestors and break down brick walls will be interested in various types of testing.

| Test Type | Who Can Test |
| --- | --- |
| Y DNA – direct paternal line | Males only |
| Mitochondrial DNA – direct matrilineal line | Males and Females |
| Autosomal – all lines | Males and Females |

Let’s begin with autosomal DNA testing for genealogy which tests your DNA inherited from all ancestral lines.

Aside from ethnicity, autosomal DNA testing provides matches to other people who have tested. A combination of trees, meaning their genealogy, and their chromosome segments are used to identify (through trees) and verify (through DNA segments) common ancestor(s) and then to assign a particular DNA segment(s) to that ancestor or ancestral couple. This process, called triangulation, then allows you to assign specific segments to particular ancestors, through segment matching among multiple people. You then know that when another individual matches you and those other people on the same segment, that the DNA comes from that same lineage. Triangulation is the only autosomal methodology to confirm ancestors who are not close relatives, beyond the past 2-3 generations or so.
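The core test behind triangulation, three people all matching each other on the same stretch of DNA, can be sketched in Python. This is a simplified illustration under stated assumptions: segments are described only by chromosome and start and end positions, the positions below are hypothetical, and a real workflow would also apply a minimum segment size in cM and confirm the common ancestor through the trees.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulate(me_b: Segment, me_c: Segment, b_c: Segment) -> Optional[Segment]:
    """Region where I match B, I match C, and B matches C; None if the three do not overlap."""
    first = overlap(me_b, me_c)
    return overlap(first, b_c) if first else None


# Hypothetical matching segments on chromosome 5.
me_b = Segment("5", 10_000_000, 40_000_000)   # where I match person B
me_c = Segment("5", 20_000_000, 55_000_000)   # where I match person C
b_c = Segment("5", 18_000_000, 42_000_000)    # where B and C match each other

print(triangulate(me_b, me_c, b_c))
```

The check that B and C match each other is the important part. Two people can match me on the same location and still not match each other, because one matches my mother's side of that chromosome and the other matches my father's side.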

All three vendors provide matching, but the tools they include and their user interfaces are quite different. 

Genealogy – Autosomal –  Family Tree DNA

Family Tree DNA entered DNA testing years before any of the others, initially with Y and mitochondrial DNA testing.

Because of the diversity of their products, their website is somewhat busier, but they do a good job of providing areas on the tester’s personal landing page for each of the products and within each product, a link for each feature or function.

For example, the Family Finder test is Family Tree DNA’s autosomal test. Within that product, tools provided are:

  • Matching
  • Chromosome Browser
  • Linked Relationships
  • myOrigins
  • Ancient Origins
  • Matrix
  • Advanced Matching

Unique autosomal tools provided by Family Tree DNA are:

  • Linked Relationships that allows you to connect individuals that you match to their location in your tree, indicating the proper relationship. Phased Family Matching uses these relationships within your tree to indicate which side of your tree other matches originate from.
  • Phased Family Matching shows which side of your tree, maternal, paternal or both, someone descends from, based on phased DNA matching between you and linked relationship matches as distant as third cousins. This allows Family Tree DNA to tell you whether matches are paternal (blue icon), maternal (red icon) or both (purple icon) without a parent’s DNA. This is one of the best autosomal tools at Family Tree DNA, shown below.

  • In Common With and Not In Common With features allow you to sort your matches in common with another individual a number of ways, or matches not in common with that individual.
  • Filtered downloads provide the downloading of chromosome data for your filtered match list.
  • Stackable filters and searches – for example, you can select paternal matches and then search for a particular surname or ancestral surname within the paternal matches.
  • Common ethnicity matching through myOrigins allows you to see selected groups of individuals who match you and share common ethnicities.
  • Y and mtDNA locations of autosomal matches are provided on your ethnicity map through myOrigins.
  • Advanced matching tool includes Y, mtDNA and autosomal in various combinations. Also includes matches within projects where the tester is a member as well as by partial surname.
  • The matrix tool allows the tester to enter multiple people that they match in order to see if those individuals also match each other. The matrix tool, in combination with the in-common-with tool and the chromosome browser, is a form of pseudo triangulation, but does not indicate that the individuals match on the same segment.

  • Chromosome browser with the ability to select different segment match thresholds to display when comparing 5 or fewer individuals to your results.
  • Projects to join which provide group interaction and allow individuals to match only within the project, if desired.
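The idea behind a matrix comparison, checking whether the people you match also match each other, can be sketched as follows. This is not Family Tree DNA's implementation, only a minimal illustration; the names and match pairs are hypothetical.

```python
from itertools import combinations


def match_matrix(people, matched_pairs):
    """Grid of True/False: does each pair of people appear on each other's match list?"""
    known = {frozenset(pair) for pair in matched_pairs}
    return {a: {b: frozenset((a, b)) in known for b in people if b != a} for a in people}


def everyone_matches(people, matched_pairs):
    """True only if every pair in the group matches each other."""
    known = {frozenset(pair) for pair in matched_pairs}
    return all(frozenset(pair) in known for pair in combinations(people, 2))


# Hypothetical names and match lists.
people = ["Ann", "Bob", "Cal"]
matched_pairs = [("Ann", "Bob"), ("Bob", "Cal")]

for person, row in match_matrix(people, matched_pairs).items():
    print(person, row)
print(everyone_matches(people, matched_pairs))
```

Like the matrix tool itself, this grid only says who matches whom. It says nothing about whether the matches fall on the same segment, so it supports triangulation but does not replace it.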

To read more about how to utilize the various autosomal tools at Family Tree DNA, with examples, click here.

Genealogy – Autosomal – Ancestry

Ancestry only offers autosomal DNA testing to their customers, so their page is simple and straightforward.

Ancestry is the only testing vendor (other than MyHeritage, which is not included in this section) to require a subscription for full functionality, although if you call the Ancestry support line, a minimal subscription is available for $49. You can see your matches without a subscription, but you cannot see your matches’ trees or utilize other functions, so you will not be able to tell how you connect to your matches. Many genealogists have Ancestry subscriptions, so this is minimally problematic for most people.

However, if you don’t realize you need a subscription initially, the required annual subscription raises the effective cost of the test quite substantially. If you let your subscription lapse, you no longer have access to all DNA features. The cost of testing with Ancestry is the cost of the test plus the cost of a subscription if you aren’t already a subscriber.

This chart, from the Ancestry support center, provides details on which features are included for free and which are only available with a subscription.

Unique tools provided by Ancestry include:

  • Shared Ancestor Hints (green leaves) which indicate a match with whom you share a common ancestor in your tree connected to your DNA, allowing you to display the path of you and your match to the common ancestor. In order to take advantage of this feature, testers must link their tree to their DNA test. Otherwise, Ancestry can’t do tree matching.  As far as I’m concerned, this is the single most useful DNA tool at Ancestry. Subscription required.

  • DNA Circles, example below, are created when several people whose DNA matches also share a common ancestor. Subscription required.

  • New Ancestor Discoveries (NADs), which are similar to Circles, but are formed when you match people descended from a common ancestor, but don’t have that ancestor in your tree. The majority of the time, these NADs are incorrect and are, when dissected and the source can be determined, found to be something like the spouse of a sibling of your ancestor. I do not view NADs as a benefit, more like a wild goose chase, but for some people these could be useful so long as the individual understands that these are NOT definitely ancestors and only hints for research. Subscription required.
  • Ancestry uses a proprietary algorithm called Timber to strip DNA from you and your matches that they consider to be “too matchy,” with the idea that those segments are identical by population, meaning likely to be found in large numbers within a population group – making them meaningless for genealogy. The problem is that Timber results in the removal of valid segments, especially in endogamous groups like Acadian families. This function is unique to Ancestry, but many genealogists (me included) don’t consider Timber a benefit.
  • Genetic Communities shows you groups of individuals with whom your DNA clusters. The trees of cluster members are then examined by Ancestry to determine connections from which Genetic Communities are formed. You can filter your DNA match results by Genetic Community.

Genealogy – Autosomal – 23andMe

Unfortunately, the 23andMe website is not straightforward or intuitive. They have spent the majority of the past two years transitioning to a “New Experience” which has resulted in additional confusion and complications when matching between people on multiple different platforms. You can take a spin through the New Experience by clicking here.

23andMe requires people to opt in to sharing, even after they have selected to participate in Ancestry Services (genealogy) testing, have opted in previously and have chosen to view their DNA Relatives. Users on the “New Experience” can then either share chromosome data and results with each other individually, meaning on a one by one basis, or globally by a one-time opt-in to “open sharing” with matches. If a user does not opt in to both DNA Relatives and open sharing, sharing requests must be made individually to each match, and they must opt in to share with each individual user. This complexity and confusion result in an approximate sharing rate of between 50 and 60%. One individual who religiously works their matches by requesting sharing now has a share rate of about 80% of their matches in the data base who HAVE initially selected to participate in DNA Relatives. You can read more about the 23andMe experience at this link.

Various genetic genealogy reports and tools are scattered between the Reports and Tools tabs, and within those, buried in non-intuitive locations. If you are going to utilize 23andMe for matching and genealogy, in addition to the above link, I recommend Kitty Cooper’s blogs about the new DNA Relatives here and on triangulation here. Print the articles, and use them as a guide while navigating the 23andMe site.

Note that some screens (the Tools, DNA Relatives, then DNA tab) on the site do not display/work correctly utilizing Internet Explorer, but do with Edge or other browsers.

The one genealogy feature unique to 23andMe is:

  • Triangulation at 23andMe allows you to select a specific match to compare your DNA against. Several pieces of information will be displayed, the last of which, scrolling to the bottom, is a list of your common relatives with the person you selected.

In the example below, I’ve selected to see the matches I match in common with known family member, Stacy Den (surnames have been obscured for privacy reasons). Please note that the Roberta V4 Estes kit is a second test that I took for comparison purposes when the new V4 version of 23andMe was released. Just ignore that match because, of course, I match myself as a twin.

If an individual does not match both you and your selected match, they will not appear on this list.

In the “relatives in common” section, each person is listed with a “shared DNA” column. For a person to be shown on this “in common” list, you obviously do share DNA with these individuals and they also share with your match, but the “shared DNA” column goes one step further. This column indicates whether or not you and your match both share a common DNA segment with the “in common” person.

I know this is confusing, so I’ve created this chart to illustrate what will appear in the “Shared DNA” column of the individuals showing on the list of matches, above, shared between me and Stacy Den.

Clicking on “Share to see” sends Sarah a sharing request for her to allow you to see her segment matches.

Let’s look at an example with “yes” in the Shared DNA column.

Clicking on the “Yes” in the Shared DNA column of Debbie takes us to the chromosome browser which shows both your selected match, Stacy in my case, and Debbie, the person whose “yes” you clicked.

All three people, meaning me, Stacy and Debbie share a common DNA segment, shown below on chromosome 17.

What 23andMe does NOT say is that these people, Stacy and Debbie, also match each other, in addition to matching me, which means all three of us triangulate.

Because I manage Stacy’s kit at 23andMe, I can check to see if Debbie is on Stacy’s match list, and indeed, Debbie is on Stacy’s match list and Stacy does match both Debbie and me on chromosome 17 in exactly the same location shown above, proving unquestionably that the three of us all match each other and therefore triangulate on this segment. In our case, it’s easy to identify our common relative whose DNA all 3 of us share.
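For readers who like to see the logic spelled out, here is a minimal sketch of the triangulation check I just walked through. It is my own illustration, not 23andMe's code. The segment positions are made up, and the names Segment, overlap and triangulates are hypothetical. The key point is that three pairwise matches are required: me to Stacy, me to Debbie, and Stacy to Debbie, all covering a common stretch of the same chromosome.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulates(me_a: Segment, me_b: Segment, a_b: Segment) -> Optional[Segment]:
    """Region covered by all three pairwise matches, or None if there is none."""
    first = overlap(me_a, me_b)
    return overlap(first, a_b) if first else None


# Made-up positions on chromosome 17, for illustration only.
me_stacy = Segment(17, 10_000_000, 30_000_000)
me_debbie = Segment(17, 12_000_000, 28_000_000)
stacy_debbie = Segment(17, 12_000_000, 29_000_000)

shared = triangulates(me_stacy, me_debbie, stacy_debbie)
print(shared)
```

If the Stacy to Debbie segment were missing, the first two matches could simply be sitting on opposite sides of my chromosome, one maternal and one paternal, which is why the third comparison matters.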

Genealogy – Autosomal Summary

While all 3 vendors offer matching, their interfaces and tools vary widely.

I would suggest that Ancestry is the least sophisticated and has worked hard to make their tools easy for the novice working with genetic genealogy. Their green leaf DNA+Tree Matching is their best feature, easy to use and important for the novice and experienced genealogist alike.  Now, if they just had that chromosome browser so we could see how we match those people.

Ancestry’s Circles, while a nice feature, encourage testers to believe that their DNA or relationship is confirmed by finding themselves in a Circle, which is not the case.

Circles can be formed as the result of misinformation in numerous trees. For example, if I were to inaccurately list Smith as the surname for one of my ancestor’s wives, I would find myself in a Circle for Barbara Smith, when in fact, there is absolutely no evidence whatsoever that her surname is Smith. Yet, people think that Barbara Smith is confirmed due to a Circle having been formed and finding themselves in Barbara Smith’s Circle. Copying incorrect trees equals the formation of incorrect Circles.

It’s also possible that I’m matching people on multiple lines and my DNA match to the people in any given Circle is through another common ancestor entirely.

A serious genealogist will test minimally at Ancestry and at Family Tree DNA, who provides a chromosome browser and other tools necessary to confirm relationships and shared DNA segments.

Family Tree DNA is more sophisticated, so consequently more complex to use. They provide matching plus numerous other tools. The website and matching are certainly friendly for the novice, but to benefit fully, some experience or additional education is beneficial, not unlike traditional genealogy research itself. This is true not just for Family Tree DNA, but also for GedMatch and 23andMe, all three of which utilize chromosome browsers.

The user will want to understand what a chromosome browser is indicating about matching DNA segments, so some level of education makes life a lot easier. Fortunately, understanding chromosome browser matching is not complex. You can read an article about Match Groups and Triangulation here. I also have an entire series of Concepts articles, and Family Tree DNA offers a webinar library, their Learning Center and other educational resources as well.

Family Tree DNA is the only vendor to provide Phased Family Matches, meaning that by connecting known relatives who have DNA tested to your tree, Family Tree DNA can then identify additional matches as maternal, paternal or both. This, in combination with pseudo-phasing, is a very powerful matching tool.

23andMe is the least friendly of the three companies, with several genetic genealogy unfriendly restrictions relative to matching, opt-ins, match limits and such. They have experienced problem after problem for years relative to genetic genealogy, which has always been a second-class citizen compared to their medical research, and not a priority.

23andMe has chosen to implement a business model where their customers must opt-in to share segment information with other individuals, either one by one or by opting into open sharing. Based on my match list, roughly 60% of my actual DNA matches have opted in to sharing.

Their customer base includes fewer serious genealogists and their customers often are not interested in genealogy at all.

Having said that, 23andMe is the only one of the three that provides actual triangulated matches for users who are on the New Experience and have opted into sharing.

If I were entering the genetic genealogy testing space today, I would test my autosomal DNA at Ancestry and at Family Tree DNA, but I would probably not test at 23andMe. I would test both my Y DNA (if a male) and mitochondrial at Family Tree DNA.

Thank you to Kitty Cooper for assistance with parent/child matching and triangulation at 23andMe.

Genealogy Autosomal Vendor Feature Summary Chart

Feature | Family Tree DNA | Ancestry | 23andMe
Matching | Yes | Yes | Yes – each person has to opt in for open sharing or authorize sharing individually, many don’t
Estimated Relationships | Yes | Yes | Yes
Chromosome Browser | Yes | No – Large Issue | Yes
Chromosome Browser Threshold Adjustment | Yes | No Chromosome Browser | No
X Chromosome Matching | Yes | No | Yes
Trees | Yes | Yes – subscription required to see matches’ trees | No
Ability to upload Gedcom file | Yes | Yes | No
Ability to search trees | Yes | Yes | No
Subscription in addition to DNA test price | No | No for partial, Yes for full functionality, minimal subscription for $49 by calling Ancestry | No
DNA + Ancestor in Tree Matches | No | Yes – Leaf Hints – subscription required – Best Feature | No
Phased Parental Side Matching | Yes – Best Feature | No | No
Parent Match Indicator | Yes | No | Yes
Sort or Group by Parent Match | Yes | Yes | Yes
In Common With Tool | Yes | Yes | Yes
Not In Common With Tool | Yes | No | No
Triangulated Matches | No – pseudo with ICW, browser and matrix | No | Yes – Best Feature
Common Surnames | Yes | Yes – subscription required | No
Ability to Link DNA Matches on Tree | Yes | No | No
Matrix to show match grid between multiple matches | Yes | No | No
Match Filter Tools | Yes | Minimal | Some
Advanced Matching Tool | Yes | No | No
Multiple Test Matching Tool | Yes | No multiple tests | No multiple tests
Ethnicity Matching | Yes | No | Yes
Projects | Yes | No | No
Maximum # of Matches Restricted | No | No | Yes – 2000 unless you are communicating with the individuals, then they are not removed from your match list
All Customers Participate | Yes | Yes, unless they don’t have a subscription | No – between 50-60% opt-in
Accepts Transfers from Other Testing Companies | Yes | No | No
Free Features with Transfer | Matching, ICW, Matrix, Advanced Matching | No transfers | No transfers
Transfer Features Requiring Unlock $ | Chromosome Browser, Ethnicity, Ancient Origins, Linked Relationships, Parentally Phased Matches | No transfers | No transfers
Archives DNA for Later Testing | Yes, 25 years | No, no additional tests available | No, no additional tests available
Additional Tool |  | DNA Circles – subscription required | 
Additional Tool |  | New Ancestor Discoveries – subscription required | 
Y DNA | Not included in autosomal test but is additional test, detailed results including matching | No | Haplogroup only
Mitochondrial DNA | Not included in autosomal test but is additional test, detailed results including matching | No | Haplogroup only
Advanced Testing Available | Yes | No | No
Website Intuitive | Yes, given their many tools | Yes, very simple | No
Data Base Size | Large | Largest | Large but many do not test for genealogy, only test for health
Strengths | Many tools, multiple types of tests, phased matching without parent | DNA + Tree matching, size of data base | Triangulation
Challenges | Website episodically times out | No chromosome browser or advanced tools | Sharing is difficult to understand and many don’t, website is far from intuitive

 

Genealogy – Y and Mitochondrial DNA

Two indispensable tools for genetic genealogy that are often overlooked are Y and mitochondrial DNA.

The inheritance path for Y DNA is shown by the blue squares and the inheritance path for mitochondrial DNA is shown by the red circles for the male and female siblings shown at the bottom of the chart.

Y-DNA Testing for Males

Y DNA is inherited by males only, from their father. The Y chromosome makes males male. Women instead inherit an X chromosome from their father, which makes them female. Because the Y chromosome is not admixed with the DNA of the mother, the same Y chromosome has been passed down through time immemorial.

Given that the Y chromosome follows the typical surname path, Y DNA testing is very useful for confirming surname lineage to an expected direct paternal ancestor. In other words, an Estes male today should match, with perhaps a few mutations, to other descendants of Abraham Estes who was born in 1647 in Kent, England and immigrated to the colony of Virginia.

Furthermore, that same Y chromosome can look far back in time, thousands of years, to tell us where that English group of Estes men originated, before the advent of surnames and before the migration to England from continental Europe. I wrote about the Estes Y DNA here, so you can see an example of how Y DNA testing can be used.

Y DNA testing for matching and haplogroup identification, which indicates where in the world your ancestors were living within the past few hundred to few thousand years, is only available from Family Tree DNA. Testing can be purchased for either 37, 67 or 111 markers, with the higher marker numbers providing more granularity and specificity in matching.

Family Tree DNA provides three types of Y DNA tests.

  • STR (short tandem repeat) testing is the traditional Y DNA testing for males to match to each other in a genealogically relevant timeframe. These tests can be ordered in panels of 37, 67 or 111 markers and lower levels can be upgraded to higher levels at a later date. An accurate base haplogroup prediction is made from STR markers.
  • SNP (single nucleotide polymorphism) testing is a different type of testing that tests single locations for mutations in order to confirm and further refine haplogroups. Think of a haplogroup as a type of genetic clan, meaning that haplogroups are used to track migration of humans through time and geography, and are what is utilized to determine African, European, Asian or Native heritage in the direct paternal line. SNP tests are optional and can be ordered one at a time or in groups called panels for a particular haplogroup. Alternatively, a comprehensive research level Y DNA test called the Big Y can be ordered after STR testing.
  • The Big Y test is a research level test that scans the entire Y chromosome to determine the most refined haplogroup possible and to report any previously unknown mutations (SNPs) that may define further branches of the Y DNA tree. This is the technique used to expand the Y haplotree.
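To make the STR matching in the first bullet above a bit more concrete, here is a minimal sketch of how two men's marker results might be compared. The marker names are real Y STR markers, but the repeat counts are invented, and the function genetic_distance is my own hypothetical illustration. It simply adds up the step differences on the markers both men tested. Vendors' actual calculations differ in the details, so treat this as a teaching aid only.

```python
def genetic_distance(a: dict, b: dict) -> int:
    """Sum of step differences over the markers both men were tested on."""
    shared_markers = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared_markers)


# Repeat counts below are invented for illustration.
tester = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
cousin = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

print(genetic_distance(tester, cousin))  # 2
```

A small distance across many markers is what "matching, with perhaps a few mutations" looks like in practice, and testing more markers gives the comparison more to work with.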

You can read more about haplogroups here and about the difference between STR markers and SNPs here, here and here.

Customers receive the following features and tools when they purchase a Y DNA test at Family Tree DNA or the Ancestry Services test at 23andMe. The 23andMe Y DNA information is included in their Ancestry Services test. The Family Tree DNA Y DNA information requires specific tests and is not included in the Family Finder test. You can click here to read about the difference in the technology between Y DNA testing at Family Tree DNA and at 23andMe. Ancestry is not included in this comparison because they provide no Y DNA related information.

Y DNA Vendor Feature Summary Chart

Feature | Family Tree DNA | 23andMe
Varying levels of STR panel marker testing | Yes, in panels of 37, 67 and 111 markers | No
Test panel (STR) marker results | Yes | Not tested
Haplogroup assignment | Yes – accurate estimate with STR panels, deeper testing available | Yes – base haplogroup by scan – haplogroup designations are significantly out of date, no further testing available
SNP testing to further define haplogroup | Yes – can purchase individual SNPs, by SNP panels or Big Y test | No
Matching to other participants | Yes | No
Trees available for your matches | Yes | No
E-mail of matches provided | Yes | No
Calculator tool to estimate probability of generational distance between you and a match | Yes | No
Earliest known ancestor information | Yes | No
Projects | Surname, haplogroup and geographic projects | No
Ability to search Y matches | Yes | No Y matching
Ability to search matches within projects | Yes | No projects
Ability to search matches by partial surname | Yes | No
Haplotree and customer result location on tree | Yes, detailed with every branch | Yes, less detailed, subset
Terminal SNP used to determine haplogroup | Yes | Yes, small subset available
Haplogroup Map | Migration map | Heat map
Ancestral Origins – summary by ancestral location of others you match, by test level | Yes | No
Haplogroup Origins – match ancestral location summary by haplogroup, by test level | Yes | No
SNP map showing worldwide locations of any selected SNP | Yes | No
Matches map showing mapped locations of your matches’ most distant ancestor in the paternal line, by test panel | Yes | No
Big Y – full scan of Y chromosome for known and previously unknown mutations (SNPs) | Yes | No
Big Y matching | Yes | No
Big Y matching known SNPs | Yes | No
Big Y matching novel variants (unknown or yet unnamed SNPs) | Yes | No
Filter Big Y matches | Yes | No
Big Y results | Yes | No
Advanced matching for multiple test types | Yes | No
DNA is archived so additional tests or upgrades can be ordered at a later date | Yes, 25 years | No

Mitochondrial DNA Testing for Everyone

Mitochondrial DNA is contributed to both genders of children by mothers, but only the females pass it on. Like the Y chromosome, mitochondrial DNA is not admixed with the DNA of the other parent. Therefore, anyone can test for the mitochondrial DNA of their matrilineal line, meaning their mother’s mother’s mother’s lineage.
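If it helps to picture the two inheritance paths, here is a small sketch that walks a pedigree up the father's father's line (the Y DNA path, relevant only when the tester is male) and the mother's mother's line (the mitochondrial path). The pedigree, the placeholder names in it and the function direct_line are all hypothetical, made up purely for illustration.

```python
FATHER, MOTHER = 0, 1

# Each person maps to (father, mother); None marks an unknown parent.
pedigree = {
    "tester": ("father", "mother"),
    "father": ("paternal grandfather", "paternal grandmother"),
    "mother": ("maternal grandfather", "maternal grandmother"),
    "paternal grandfather": (None, None),
    "maternal grandmother": (None, None),
}


def direct_line(tree: dict, person: str, parent_index: int) -> list:
    """Follow only fathers (index 0) or only mothers (index 1) up the tree."""
    line = []
    while person in tree:
        parent = tree[person][parent_index]
        if parent is None:
            break
        line.append(parent)
        person = parent
    return line


y_line = direct_line(pedigree, "tester", FATHER)   # Y DNA path, male testers only
mt_line = direct_line(pedigree, "tester", MOTHER)  # mitochondrial path, everyone

print(y_line)
print(mt_line)
```

Every other ancestor in the tree sits off these two lines, which is why you need to find other relatives to test in order to learn about the Y and mitochondrial DNA of your other ancestral lines.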

Matching can identify family lines as well as ancient lineage.

You receive the following features and tools when you purchase a mitochondrial DNA test from Family Tree DNA or the Ancestry Services test from 23andMe. The Family Tree DNA mitochondrial DNA information requires specific tests and is not included in the Family Finder test. The 23andMe mitochondrial information is provided with the Ancestry Services test. Ancestry is omitted from this comparison because they do not provide any mitochondrial information.

Mitochondrial DNA Vendor Feature Summary Chart

Feature | Family Tree DNA | 23andMe
Varying levels of testing | Yes, mtPlus and Full Sequence | No
Test panel marker results | Yes, in two formats, CRS and RSRS | No
Rare mutations, missing and extra mutations, insertions and deletions reported | Yes | No
Haplogroup assignment | Yes, most current version, Build 17 | Yes, partial and out of date version
Matching to other participants | Yes | No
Trees of matches available to view | Yes | No
E-mail address provided to matches | Yes | No
Earliest known ancestor information | Yes | No
Projects | Surname, haplogroup and geographic available | No
Ability to search matches | Yes | No
Ability to search matches within project | Yes | No projects
Ability to search match by partial surname | Yes | No
Haplotree and customer location on tree | No | Yes
Mutations used to determine haplogroup provided | Yes | No
Haplogroup Map | Migration map | Heat map
Ancestral Origins – summary by ancestral location of others you match, by test level | Yes | No
Haplogroup Origins – match ancestral location summary by haplogroup | Yes | No
Matches map showing mapped locations of your matches’ most distant ancestor in the maternal line, by test level | Yes | No
Advanced matching for multiple test types | Yes | No
DNA is archived so additional tests or upgrades can be ordered at a later date | Yes, 25 years | No

 

Overall Genealogy Summary

Serious genealogists should test with at least two of the three major vendors, namely Family Tree DNA and Ancestry, with 23andMe coming in as a distant third.

No genetic genealogy testing regimen is complete without Y and mitochondrial DNA for as many ancestral lines as you can find to test. You don’t know what you don’t know, and you’ll never know if you don’t test.

Unfortunately, many people, especially new testers, don’t know Y and mitochondrial DNA testing for genetic genealogy exists, or how it can help their genealogy research, which is extremely ironic since these were the first tests available, back in 2000.

You can read about finding Y and mitochondrial information for various family lines and ancestors and how to assemble a DNA Pedigree Chart here.

You can also take a look at my 52 Ancestors series, where I write about an ancestor every week. Each article includes some aspect of DNA testing and knowledge gained by a test or tests, DNA tool, or comparison. The DNA aspect of these articles focuses on how to use DNA as a tool to discover more about your ancestors.

Testing for Medical/Health or Traits

The DTC market also includes health and medical testing, although it’s not nearly as popular as genetic genealogy.

Health/medical testing is offered by 23andMe, who also offers autosomal DNA testing for genealogy.

Some people do want to know if they have genetic predispositions to medical conditions, and some do not. Some want to know if they have certain traits that aren’t genealogically relevant, but might be interesting – such as whether they carry the Warrior gene or if they have an alcohol flush reaction.

23andMe was the first company to dip their toes into the water of Direct to Consumer medical information, although they called it “health,” not medicine, at that time. Regardless of the terminology, information regarding Parkinson’s and Alzheimer’s, for example, was provided for customers. 23andMe attempted to take the raw data and provide the consumer with something approaching a middle of the road analysis, because sometimes the actual studies provide conflicting information that might not be readily understood by consumers.

The FDA took issue with 23andMe back in November of 2013 when they ordered 23andMe to discontinue the “health” aspect of their testing after 23andMe ignored several deadlines. In October 2015, 23andMe obtained permission to provide customers with some information, such as carrier status, for 36 genetic disorders.

Since that time, 23andMe has divided their product into two separate tests, with two separate prices. The genealogy only test called Ancestry Service can be purchased separately for $99, or the combined Health + Ancestry Service for $199.

If you are interested in seeing what the Health + Ancestry test provides, you can click here to view additional information.

However, there is a much easier and less expensive solution.

If you have taken the autosomal test from 23andMe, Ancestry or Family Tree DNA, you can download your raw data file from the vendor and upload to Promethease to obtain a much more in-depth report than is provided by 23andMe, and much less expensively – just $5.

I reviewed the Promethease service here. I found the Promethease reports to be very informative and I like the fact that they provide information, both positive and negative for each SNP (DNA location) reported. Promethease avoids FDA problems by not providing any interpretation or analysis, simply the data and references extracted from SNPedia for you to review.

I would be remiss if I didn’t mention that you should be sure you really want to know before you delve into medical testing. Some mutations are simply indications that you could develop a condition that you will never develop or that is not serious. Other mutations are not so benign. Promethease provides this candid page before you upload your data.

Different files from different vendors provide different results at Promethease, because those vendors test different SNP locations in your DNA. At the Promethease webpage, you can view examples.

Traits

Traits fall someplace between genealogy and health. When you take the Health + Ancestry test at 23andMe, you do receive information about various traits, as follows:

Of course, you’ll probably already know if you have several of these traits by just taking a look in the mirror, or in the case of male back hair, by asking your wife.

At Family Tree DNA, existing customers can order tests for Factoids (by clicking on the upgrade button), noted as curiosity tests for gene variants.

Family Tree DNA provides what I feel is a great summary and explanation of what the Factoids are testing on their order page:

“Factoids” are based on studies – some of which may be controversial – and results are not intended to diagnose disease or medical conditions, and do not serve the purpose of medical advice. They are offered exclusively for curiosity purposes, i.e. to see how your result compared with what the scientific papers say. Other genetic and environmental variables may also impact these same physiological characteristics. They are merely a conversational piece, or a “cocktail party” test, as we like to call it.”

Test | Price | Description
Alcohol Flush Reaction | $19 | A condition in which the body cannot break down ingested alcohol completely. Flushing, after consuming one or two alcoholic beverages, includes a range of symptoms: nausea, headaches, light-headedness, an increased pulse, occasional extreme drowsiness, and occasional skin swelling and itchiness. These unpleasant side effects often prevent further drinking that may lead to further inebriation, but the symptoms can lead to mistaken assumption that the people affected are more easily inebriated than others.
Avoidance of Errors | $29 | We are often angry at ourselves because we are unable to learn from certain experiences. Numerous times we have made the wrong decision and its consequences were unfavorable. But the cause does not lie only in our thinking. A mutation in a specific gene can also be responsible, because it can cause a smaller number of dopamine receptors. They are responsible for remembering our wrong choices, which in turn enables us to make better decisions when we encounter a similar situation.
Back Pain | $39 | Lumbar disc disease is the drying out of the spongy interior matrix of an intervertebral disc in the spine. Many physicians and patients use the term lumbar disc disease to encompass several different causes of back pain or sciatica. A study of Asian patients with lumbar disc disease showed that a mutation in the CILP gene increases the risk of back pain.
Bitter Taste Perception | $29 | There are several genes that are responsible for bitter taste perception – we test 3 of them. Different variations of this gene affect ability to detect bitter compounds. About 25% of people lack ability to detect these compounds due to gene mutations. Are you like them? Maybe you don’t like broccoli, because it tastes too bitter?
Caffeine Metabolism | $19 | According to the results of a case-control study reported in the March 8, 2006 issue of JAMA, coffee is the most widely consumed stimulant in the world, and caffeine consumption has been associated with increased risk for non-fatal myocardial infarction. Caffeine is primarily metabolized by the cytochrome P450 1A2 in the liver, accounting for 95% of metabolism. Carriers of the gene variant *1F allele are slow caffeine metabolizers, whereas individuals homozygous for the *1A/*1A genotype are rapid caffeine metabolizers.
Earwax Type | $19 | Whether your earwax is wet or dry is determined by a mutation in a single gene, which scientists have discovered. Wet earwax is believed to have uses in insect trapping, self-cleaning and prevention of dryness in the external auditory canal of the ear. It also produces an odor and causes sweating, which may play a role as a pheromone.
Freckling | $19 | Freckles can be found on anyone no matter what the background. However, having freckles is genetic and is related to the presence of the dominant melanocortin-1 receptor MC1R gene variant.
Longevity | $49 | Researchers at Harvard Medical School and UC Davis have discovered a few genes that extend lifespan, suggesting that the whole family of SIR2 genes is involved in controlling lifespan. The findings were reported July 28, 2005 in the advance online edition of Science.
Male Pattern Baldness | $19 | Researchers at McGill University, King’s College London and GlaxoSmithKline Inc. have identified two genetic variants in Caucasians that together produce an astounding sevenfold increase of the risk of male pattern baldness. Their results were published in the October 12, 2008 issue of the Journal of Nature Genetics.
Monoamine Oxidase A (Warrior Gene) | $49.50 | The Warrior Gene is a variant of the gene MAO-A on the X chromosome. Recent studies have linked the Warrior Gene to increased risk-taking and aggressive behavior. Whether in sports, business, or other activities, scientists found that individuals with the Warrior Gene variant were more likely to be combative than those with the normal MAO-A gene. However, human behavior is complex and influenced by many factors, including genetics and our environment. Individuals with the Warrior Gene are not necessarily more aggressive, but according to scientific studies, are more likely to be aggressive than those without the Warrior Gene variant. This test is available for both men and women, however, there is limited research about the Warrior Gene variant amongst females. Additional details about the Warrior Gene genetic variant of MAO-A can be found in Sabol et al, 1998.
Muscle Performance | $29 | A team of researchers, led by scientists at Dartmouth Medical School and Dartmouth College, have identified and tested a gene that dramatically alters both muscle metabolism and performance. The researchers say that this finding could someday lead to treatment of muscle diseases, including helping the elderly who suffer from muscle deterioration and improving muscle performance in endurance athletes.
Nicotine Dependence | $19 | In 2008, University of Virginia Health System researchers have identified a gene associated with nicotine dependence in both Europeans and African Americans.

Many people are interested in the Warrior Gene, which I wrote about here.

At Promethease, traits are simply included with the rest of the conditions known to be associated with certain SNPs, such as baldness, for example, but I haven’t done a comparison to see which traits are included.

 

Additional Vendor Information to Consider

Before making your final decision about which test or tests to purchase, there are a few additional factors you may want to consider.

As mentioned before, Ancestry requires a subscription in addition to the cost of the DNA test for the DNA test to be fully functional.

One of the biggest issues, in my opinion, is that both 23andMe and Ancestry sell customers’ anonymized DNA information to unknown others. Every customer authorizes the sale of their information when they purchase or activate a kit – even though very few people actually take the time to read the Terms and Conditions, Privacy statements and Security documents, including any and all links. This means most people don’t realize they are authorizing the sale of their DNA.

At both 23andMe and Ancestry, you can ALSO opt in for additional non-anonymized research or sale of your DNA, which you can later opt out of. However, you cannot opt out of the lower level sale of your anonymized DNA without removing your results from the data base and asking for your sample to be destroyed. They do tell you this, but it’s very buried in the fine print at both companies. You can read more here.

Family Tree DNA does not sell your DNA or information.

All vendors can change their terms and conditions at any time. Consumers should always thoroughly read the terms and conditions including anything having to do with privacy for any product they purchase, but especially as it relates to DNA testing.

Family Tree DNA archives your DNA for later testing, which has proven extremely beneficial when a family member has passed away and a new test is subsequently introduced or the family wants to upgrade a current test.  Had my mother’s DNA not been archived at Family Tree DNA, I would not have Family Finder results for her today – something I thank Mother and Family Tree DNA for every single day.

Family Tree DNA also accepts transfer files from 23andMe, Ancestry and very shortly, MyHeritage – although some versions work better than others. For details on which companies accept which file versions, from which vendors, and why, please read Autosomal DNA Transfers – Which Companies Accept Which Tests?

If you tested on a compatible version of the 23andMe Test (V3 between December 2010 and November 2013) or the Ancestry V1 (before May 2016) you may want to transfer your raw data file to Family Tree DNA for free and pay only $19 for full functionality, as opposed to taking the Family Finder test. Family Tree DNA does accept later versions of files from 23andMe and Ancestry, but you will receive more matches if you test on the same chip platform that Family Tree DNA utilizes instead of doing a transfer.

Additional Vendor Considerations Summary Chart

Consideration | Family Tree DNA | Ancestry | 23andMe
Subscription required in addition to cost of DNA test | No | Yes for full functionality, partial functionality is included without subscription, minimum subscription is $49 by calling Ancestry | No
Customer Support | Good and available | Available, nice but often not knowledgeable about DNA | Poor
Sells customer DNA information | No | Yes | Yes
DNA raw data file available to download | Yes | Yes | Yes
DNA matches file available to download including match info and chromosome match locations | Yes | No | Yes
Customers genealogically focused | Yes | Yes | Many No
Accepts DNA raw data transfer files from other companies | Yes, most, see article for specifics | No | No
DNA archived for later testing | Yes, 25 years | No | No
Beneficiary provision available | Yes | No | No

 

Which Test is Best For You?

I hope you now know the answer as to which DNA test is best for you – or maybe it’s multiple tests for you and other family members too!

DNA testing holds so much promise for genealogy. I hesitate to call DNA testing a miracle tool, but it often is when there are no records. DNA testing works best in conjunction with traditional genealogical research.

There are a lot of tests and options.  The more tests you take, the more people you match. Some people test at multiple vendors or upload their DNA to third party sites like GedMatch, but most don’t. In order to make sure you reach those matches, which may be the match you desperately need, you’ll have to test at the vendor where they tested. Otherwise, they are lost to you. That means, of course, that eventually, if you’re a serious genealogist, you’ll be testing at all 3 vendors.  Don’t forget about Y and mitochondrial tests at Family Tree DNA.

Recruit family members to test and reach out to your matches.  The more you share and learn – the more is revealed about your ancestors. You are, after all, the unique individual that resulted from the combination of all of them!

Update: Vendor prices updated June 22, 2017.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Introducing the Match-Maker-Breaker Tool for Parental Phasing

A few days after I published the article, Concepts – Segment Size, Legitimate and False Matches, Philip Gammon, a statistician who lives in Australia, posted a comment to my blog.

Great post Roberta! I’m a statistician so my eyes light up as soon as I see numbers. That table you have produced showing by segment length the percentage that are IBD is one of the most useful pieces of information that I have seen. Two days to do the analysis!!! I’m sure that I could write a formula that would identify the IBD segments and considerably reduce this time.

By this time, my eyes were lighting up too, because the work for the original article had taken me two days to complete manually, just using segments 3 cM and above. Using smaller segments would have taken days longer. By manually, I mean comparing the child’s matches with those of both parents to see which parent, if either, the child’s match also matches on the same segment.

In the simplest terms, the Segment Size article explained how to copy the child’s and both parents’ matches to a spreadsheet and then manually compare the child’s matches to those of the parents. In the example above, you can see that both the child and the mother have matches to Cecelia. As it turns out, the exact same segment of DNA was passed in its entirety to the child from the mother, who is shown in pink – so Cecelia matches both the child and the parent on exactly the same segment.

That’s not always the case, and the Segment Size article went into much greater detail.

For the past month or so, Philip and I have been working back and forth, along with some kind volunteers who tested Philip’s new tool, in order to create something so that you too can do this comparison and in much less than two days.

Foundation

Here’s the underlying principle for this tool – if a child has a match that does NOT match either parent on the same segment, then the match is not a legitimate match. It’s a false match, identical by chance, and it is NOT genealogically relevant.

If the child’s match also matches either parent on the same segment, it is most likely a match by descent and is genealogically relevant.

For those of you who noticed the words “most likely,” yes, it is possible for someone to match a parent and child both and still not phase (or match) to the next higher generation, but it’s unusual and so far, only found in smaller segments. I wrote about multiple generation phasing in the article, “Concepts – Segment Survival – 3 and 4 Generation Phasing.” Once a segment phases, it tends to continue phasing, especially with segments above about 3.5 cM.
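
For readers who like to see the logic spelled out, the principle above can be sketched in a few lines of Python. This is only an illustration and is not Philip’s spreadsheet formulas: the `Segment` tuple, the `phase_child_match` function and the sample positions are all made up for the example, and "the same segment" is assumed here to mean any overlap with the same matching person on the same chromosome.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    match_name: str   # the person the tester matches
    chromosome: str
    start: int        # start position of the matching segment
    end: int          # end position of the matching segment

def overlaps(a, b):
    """True if both segments are to the same person, on the same
    chromosome, and share at least one position."""
    return (a.match_name == b.match_name
            and a.chromosome == b.chromosome
            and a.start <= b.end and b.start <= a.end)

def phase_child_match(child_seg, father_segs, mother_segs):
    """Label one of the child's segments by which parent(s) share it."""
    in_father = any(overlaps(child_seg, s) for s in father_segs)
    in_mother = any(overlaps(child_seg, s) for s in mother_segs)
    if in_father and in_mother:
        return "both"
    if in_father:
        return "father"
    if in_mother:
        return "mother"
    return "neither"  # treated as identical by chance under this principle

# Made-up sample data
mother = [Segment("Cecelia", "1", 1_000_000, 5_000_000)]
father = []
child = [
    Segment("Cecelia", "1", 1_000_000, 5_000_000),
    Segment("Annie", "2", 100_000, 900_000),
]
for seg in child:
    print(seg.match_name, "->", phase_child_match(seg, father, mother))
```

With this sample data, the segment shared with Cecelia is labeled "mother" and the one shared with Annie is labeled "neither."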

For those who have both parents available to test, phased matching is a HUGE benefit.

But I Have Only One Parent Available

You can still use the tool to identify matches to that one parent, but you CANNOT presume that matches that DON’T match that parent are from the other (missing) parent. Matches matching the child but not matching the tested parent can be due to:

  • A match to the missing parent
  • A false match that is not genealogically relevant

According to the statistics generated from Philip’s Match-Maker-Breaker tool, shown below, segments 9 cM and above tend to match one or the other parent 90% or more of the time.  Segments 12 cM and over match 97% of the time or more, so, in general, one could “assume” (dangerous word, I know) that segments of this size that don’t match to the tested parent would match to the other parent if the other parent was available. You can also see that the reliability of that assumption drops rapidly as the segment sizes get smaller.
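
As a rough illustration of that reasoning, here is a small Python sketch. The function name and the wording of the labels are my own invention; only the 9 cM and 12 cM thresholds come from the statistics quoted above, and the result is a rule of thumb, not proof.

```python
def interpret_unmatched_segment(centimorgans):
    """Rough reading of a child's segment that does NOT match the one
    tested parent, using the 9 cM and 12 cM figures quoted above."""
    if centimorgans >= 12:
        return "probably from the untested parent"
    if centimorgans >= 9:
        return "likely from the untested parent"
    return "undetermined: untested parent or false match"

for cm in (15.2, 9.8, 4.1):
    print(cm, "cM:", interpret_unmatched_segment(cm))
```

The smaller the segment, the less weight the first two labels deserve, which is why everything under 9 cM is left undetermined here.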

Platform

This tool was written utilizing Microsoft Excel and only works reliably on that platform.

If you are using Excel and are NOT attempting to use Mac Numbers, skip this section.  If you want to attempt to use Numbers, read this section.

I tried, along with a Mac person, to coax Numbers (the free Mac spreadsheet) into working. If you have any option other than using Numbers, do so. Microsoft Excel for Mac seemed to work fine, but it was only tested on one Mac.

Here’s what I discovered when trying to make Numbers work:

  • You must first launch Numbers and then select the various spreadsheets.
  • The tabs are not at the bottom and are instead at the top without color.
  • Copying the formulas in cells H2-K2 throughout the spreadsheet, as the instructions describe, must be done manually with a copy/paste.
  • After the above step, the calculations literally took a couple of hours (MacBook Air) instead of a couple of minutes on the PC platform. The older Mac desktop still took significantly longer than a Microsoft PC, but less time than the solid state MacBook Air.
  • After the calculations complete, the rows on the child’s spreadsheet are not colored, which is one of the major features of the Match-Maker-Breaker tool, as Numbers reports that “Conditional highlighting rules using formulas are not supported and were removed.”
  • Surprisingly, the statistical Reports page seems to function correctly.

How Long Does Running Match-Maker-Breaker Tool on a PC Take?

The first time I ran this tool, which included reading Philip’s instructions for the first time, the entire process took me about 10 minutes after I downloaded the files from Family Tree DNA.

Vendors

This tool only works with matches downloaded from Family Tree DNA.

Transfer Kits

It’s strongly suggested that all 3 individuals being compared have tested at Family Tree DNA or on the same chip version imported into Family Tree DNA.

Matches not run on the same chip as Family Tree DNA testers can only provide a portion of the matches that the same person’s results run on the FTDNA chip can provide. You can run the matching tool with transferred results, but the results will only provide a subset of the results that will be provided by having all parties that are being compared, meaning the child and both parents, test at Family Tree DNA.

The following product versions CAN all be compared successfully at Family Tree DNA, as they all utilize the same Illumina chip:

  • All Family Finder tests
  • Ancestry V1 (before May 2016)
  • 23andMe V3 (before November 2013)
  • MyHeritage

The following tests do NOT utilize the same Illumina testing platform and cannot be compared successfully with Family Finder tests from Family Tree DNA, or the list above. Cross platform testing results cannot be reliably compared. Those that DO match will be accurate, but many will not match that would match if all 3 testers were utilizing the same platform, therefore leading you to inaccurate conclusions.

  • Ancestry V2 (beginning in May 2016 to present)
  • 23andMe V4 (beginning November 2013 to present)

The child and two parents should not be compared utilizing mixed platforms – meaning, for example, that the child should not have been tested at FTDNA and the parents transferred from Ancestry on the V2 platform since May 2016.

If any of the three family members, being the child or either parent, have tested on an incompatible platform, they should retest at Family Tree DNA before using this tool.
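
If you keep notes on who tested where, a quick check like the following Python sketch can flag a problem before you start. The platform labels are simply the names from the lists above typed in as text, and the `trio_problems` function is hypothetical; nothing here reads anything from a vendor.

```python
# Labels copied from the compatible list above (typed by hand, not read from a vendor)
COMPATIBLE_PLATFORMS = {"Family Finder", "Ancestry V1", "23andMe V3", "MyHeritage"}

def trio_problems(platform_by_person):
    """Return the people whose test is not on the compatible list."""
    return [person for person, platform in platform_by_person.items()
            if platform not in COMPATIBLE_PLATFORMS]

trio = {"child": "Family Finder", "father": "Ancestry V2", "mother": "23andMe V3"}
print(trio_problems(trio))
```

In this made-up trio, only the father is flagged, so he would be the one to retest.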

What You Need

  • You will need to download the chromosome match lists from the child and both parents, AT THE SAME TIME. I can’t stress this enough, because any matches that have been added for any one of the three people at a later time than the others will skew the matching and the statistics. Matches are being added all the time.
  • You will also need a relatively current version of Excel on your computer to run this tool. No, I did not do version compatibility testing so I don’t know how old is too old. I am running MSOffice 2013.
  • You will need to know how to copy and paste data from and to a spreadsheet.

Instructions for Downloading Match Files

My recommendation is that you download your matches just before utilizing this tool.

To download your matches, sign on to each account. On your main page, you will see the Family Finder section, and the Chromosome Browser. Click on that link.

At the top of the chromosome browser page, below, you’ll see the image of chromosomes 1 through X. At the top right, you’ll see the option to “Download all matches to Excel (CSV Format).” Click on that link.

Next, you’ll receive a prompt to open or save the file. Save it to a file name that includes the name of the person plus the date you did the download. I created a separate folder so there would be no confusion about which files are which and whether or not they are current.

Your match file includes all of your matches and the chromosome matching locations like the example shown below.

These files of matches are what you’ll need to copy into the Match-Maker-Breaker spreadsheet.
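
If you are curious what is inside one of these files, the Python sketch below reads a match file into a list. The seven column headers are an assumption on my part about how the download is labeled, so check the first row of your own file, and the two sample rows are invented. None of this is needed to use the Match-Maker-Breaker spreadsheet.

```python
import csv
import io

# Assumed column headers; compare against the first row of your own download.
SAMPLE = """NAME,MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS,MATCHING SNPS
Child,Cecelia,1,1000000,5000000,6.2,1500
Child,Annie,X,200000,900000,1.8,600
"""

def read_matches(handle):
    """Turn each row of a match file into a small dictionary."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "match": row["MATCHNAME"],
            "chromosome": row["CHROMOSOME"],   # kept as text because of "X"
            "start": int(row["START LOCATION"]),
            "end": int(row["END LOCATION"]),
            "cm": float(row["CENTIMORGANS"]),
            "snps": int(row["MATCHING SNPS"]),
        })
    return rows

matches = read_matches(io.StringIO(SAMPLE))
print(len(matches), "segments read")
```

To read a real download instead of the sample text, pass `open(path, newline="")` to `read_matches`, where `path` is wherever you saved the file.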

Do not delete any information from your match spreadsheets. If you normally delete small segments, don’t. You may cause a non-match situation if the parent carries a larger portion of the same segment.

You can rerun the Match-Maker-Breaker tool at will, and it only takes a very few minutes.

The Match-Maker-Breaker Tool

The Match-Maker-Breaker Tool has 5 sheets when you open the spreadsheet:

  • Instructions – Please read entirely before beginning.
  • Results – The page where your statistical results will be placed.
  • Child – The page where you will paste the child’s matches and then look at the match results after processing.
  • Father – The page where you will paste the father’s matches.
  • Mother – The page where you will paste the mother’s matches.

Download

Download the free Match-Maker-Breaker tool which is a spreadsheet by clicking on this link: Match-Maker-Breaker Tool V2

Please don’t start using the tool before reading the instructions completely and reading the rest of this article.

Make a Copy

After you download the tool, make a copy on your system. You’ll want to save the Match-Maker-Breaker spreadsheet file for each trio of people individually, and you’ll want a fresh Match-Maker-Breaker spreadsheet copy to run with each new set of download files.

Instructions

I’m not going to repeat Philip’s instructions here, but please read them entirely before beginning and please follow them exactly. Philip has included graphic illustrations of each step to the right of the instruction box. The spreadsheet opens to the Instructions page. You can print the instruction page as well.

Copy/Pasting Data

When copying the parents’ and child’s data into the spreadsheets, do NOT copy and paste the entire page by selecting the page. Instead, in the child’s chromosome browser download spreadsheet, select the relevant columns by clicking on the column letters A through G across the top, as shown below.  After they are selected, click on “copy.” Then position the cursor in the first cell in row 1 of the child’s page of the Match-Maker-Breaker spreadsheet and click on “paste.”

Do NOT select columns H-K when highlighting and copying, or your paste will wipe out Philip’s formulas to do calculations on the child’s tab on the spreadsheet.

The example above, assuming that Annie is the last entry on the spreadsheet, shows that I’ve highlighted all of the cells in columns A-G, prior to executing the copy command. Your spreadsheets of course will be much longer.

I wrote a very quick and dirty article about using Excel here.

The Match Making Breaking Part

After you copy the formulas from cells H2 through K2 down through the rest of the spreadsheet by following Philip’s instructions, you’ll see the results populating in the status bar at the bottom. You’ll also see colors being added to the matches on the left hand side of the spreadsheet page and counts accruing in the 4 right columns. Be patient and wait. It may take a few minutes. When it’s finished, you can verify by scrolling to the last row on the child’s page, where you’ll see something like the example below: every row has been assigned a color, and every match that matches the child and the father, the mother or both, or that is found in the HLA region, is counted as 1 in the right 4 columns.

In this example, 5 segments, shown in grey, don’t match anyone; one, shown in tan, is found in the HLA region; and three, shown in blue, match the father.

Output

After you run the Match-Maker-Breaker tool, the child’s matches on the Child tab will be identified as follows:

This means that the child’s segment that matches that individual also matches the father, the mother, both parents, the HLA region, or none of the above on all or part of that same segment.

What is a Match?

Philip and I worked to answer the question, “what is a match?” In the Concepts article, I discussed the various kinds of matches.

  • Full match: The child’s match and parent’s match share the same exact segment, meaning same start and end points and same number of SNPs within that segment.
  • Partial match: The child’s match matches a portion of the segment from the parent – meaning that the child inherited part of the segment, but not the entire segment.
  • Overhanging match: The child’s match matches part or all of the parent’s segment, but either the beginning or end extends further than the parent’s match. This means that the overlapping portion is legitimate, meaning identical by descent (IBD), but the overhanging portion is identical by chance (IBC).
  • Nested match: The child’s match is smaller than the match to the parent, but fully within the parent’s match, indicating a legitimate match.
  • No match: The person matches the child, but neither parent, meaning that this match is not legitimate. It’s identical by chance (IBC).
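
To make the geometry of these match types concrete, here is a minimal Python sketch that compares one of the child’s segments to one of a parent’s segments, using only start and end positions. The function name and sample positions are invented. Because partial and nested matches look the same when you only have start and end points (the child’s segment sits inside the parent’s), the sketch reports them together, and it does not check SNP counts.

```python
def match_type(child_start, child_end, parent_start, parent_end):
    """Compare one child segment to one parent segment for the same match."""
    if child_end < parent_start or parent_end < child_start:
        return "no match"            # the two segments do not touch
    if child_start == parent_start and child_end == parent_end:
        return "full"                # identical start and end points
    if child_start >= parent_start and child_end <= parent_end:
        return "partial or nested"   # child's segment lies inside the parent's
    return "overhanging"             # child's segment extends past the parent's

print(match_type(100, 200, 100, 200))   # full
print(match_type(120, 180, 100, 200))   # partial or nested
print(match_type(90, 150, 100, 200))    # overhanging
print(match_type(10, 20, 100, 200))     # no match
```

Note that "no match" in the list above means the child’s match matches neither parent, so in practice this comparison would be repeated against every segment from both parents.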

Full matches and no matches are easy.

However, partial matches, overhanging matches and nested matches are not as straightforward.

What, exactly, is a match? Let’s look at some different scenarios.

If someone matches a parent on a large segment, say 20cM, and only matches the child on 2cM, fully within the parent’s segment, is this match genealogically relevant, or could the match be matching the child by chance on a part of the same segment that they match the parents by descent? We have no way to know for sure, just utilizing this tool. Hopefully, in this case, the fact that the person matches the parent on a large segment would answer any genealogical questions through triangulation.

If the person matches the parent but only matches the child on a small portion of the same segment plus an overhanging region, is that a valid match? Because they do match on an overhanging region, we know that match is partly identical by chance, but is the entire match IBC or is the overlapping part legitimate? We don’t know. Partly, how strongly I would consider this a valid match would be the size of the matching portion of the segment.

One of the purposes of phasing and then looking at matches is to, hopefully, learn more about which matches are legitimate, which are not, and predictors of false versus legitimate matches.

Relative to this tool, no editing has been done, meaning that matches are presented exactly as that, regardless of their size or the type of match. A match is a match if any portion of the match’s DNA to the child overlaps any portion of either or both parents’ DNA, with the exception of part of chromosome 6. It’s up to you, as the genealogist, to figure out by utilizing triangulation and other tools whether the match is relevant or not to your genealogy.

If you are not familiar with identical by descent (IBD), meaning a legitimate match; identical by population (IBP), meaning identical by descent but only because the population as a whole carries that segment; and identical by chance (IBC), meaning a false match, the article Identical by…Descent, State, Population and Chance explains the terms and the concepts so that you can apply them usefully.

About Chromosome 6

After we analyzed the results of several people, the area of chromosome 6 that includes the HLA region was excluded from the analysis. Long known to be a pileup region where people carry significant segments of the same DNA that is not genealogically relevant (meaning IBP or identical by population), this region has been found to be often unreliable genealogically, and falls outside the norm as compared to the rest of the segments. This area has been annotated separately and excluded from match results. This was the only region found to universally have this effect.

This does not mean that a match in this region is positively invalid or false, but matches in the HLA region should be viewed very skeptically.
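
For illustration only, setting a region aside could look something like the Python sketch below. The boundary positions are rough placeholders that I have chosen for the example; the exact cutoffs used by the tool are not given in this article. Flagging any segment that overlaps the region at all is also my assumption.

```python
# Placeholder boundaries (base pairs) for the chromosome 6 region to set aside.
# These are illustrative assumptions, not the tool's actual cutoffs.
HLA_CHROMOSOME = "6"
HLA_START = 28_000_000
HLA_END = 34_000_000

def touches_hla(chromosome, start, end):
    """True if a segment overlaps the placeholder HLA region at all."""
    return (str(chromosome) == HLA_CHROMOSOME
            and start <= HLA_END and HLA_START <= end)

print(touches_hla("6", 30_000_000, 32_000_000))   # inside the region
print(touches_hla("6", 50_000_000, 60_000_000))   # same chromosome, elsewhere
print(touches_hla("7", 30_000_000, 32_000_000))   # different chromosome
```

A segment flagged this way would be counted separately rather than thrown away, in keeping with viewing these matches skeptically instead of treating them as automatically false.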

The Results Tab – Statistics

Now that you’ve populated the spreadsheet and you can see on the Child tab which matches also match either or both parents, or neither, or the HLA region, go to the Results tab of the spreadsheet.

This tab gives you some very interesting statistics.

First, you’ll see the number and percent of matches by chromosome.

The person compared was a female, so she would have X matches to both parents. However, notice that X matching is significantly lower than any of the other chromosomes.

Frankly, I’ve suspected for a long time that there was a dramatic difference in matching with the X chromosome, and wrote about it here. It was suggested by some at the time that I was only reporting my personal observations that would not hold beyond a few results (ascertainment bias), but this proves that there is something different about X chromosome matching. I don’t know what or why, but according to this data that is consistent between all of the beta testers, matching to the X chromosome is much less reliable.

The second statistics box shows statistics for the matches to the child that also match the parents. The child’s actual matches to the parents themselves are the 23 shown under “excluded from calculations.”

The next group of statistics on your page will be your own, but for this example, Philip has combined the results from several beta testers and provided summary information, so that the statistics are not skewed by any one individual.

Next, the match results by segment size for chromosomes 1-22. Philip has separated out segments with less than 500 SNPs and reports them separately.

You will note that 90% or more of the segments 9 cM and above match one of the two parents, and 97% or more of segments 12cM or above.
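
If you would like to produce this kind of breakdown from your own labeled segments, the tally is straightforward. The Python sketch below is my own illustration, not the formulas behind the Results tab: the bin edges are arbitrary choices and the five sample segments are invented.

```python
def size_bin(cm, edges=(3, 5, 7, 9, 12)):
    """Return a label such as '9-12 cM' for the size bin that holds cm."""
    previous = 0
    for edge in edges:
        if cm < edge:
            return f"{previous}-{edge} cM"
        previous = edge
    return f"{previous}+ cM"

def percent_phased_by_size(segments, edges=(3, 5, 7, 9, 12)):
    """segments: (centimorgans, matched_a_parent) pairs for the child."""
    tallies = {}
    for cm, phased in segments:
        label = size_bin(cm, edges)
        matched, total = tallies.get(label, (0, 0))
        tallies[label] = (matched + (1 if phased else 0), total + 1)
    return {label: 100.0 * m / t for label, (m, t) in tallies.items()}

# Invented sample: (segment size in cM, did it match either parent?)
sample = [(1.5, False), (2.0, False), (2.5, True), (10.0, True), (15.0, True)]
result = percent_phased_by_size(sample)
for label, pct in result.items():
    print(f"{label}: {pct:.0f}% matched a parent")
```

With only a handful of segments per bin the percentages jump around, which is one reason combined results from several testers are more informative than any single trio.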

The X chromosome follows, analyzed separately. You’ll notice that while 27% of the matches on chromosomes 1-22 match one or both parents, only 14% of the X matches do.

Even with larger segments, not all X segments match both the child and the parents, suggesting that skepticism is warranted when evaluating X chromosome matches.

Philip then calculated a nice graph for showing matching autosomal segments by cM size, excluding the X.

The next set of charts shows matches by SNP density. Many people neglect SNP count when evaluating results, but the higher the SNP count, the more robust the match.

Note that segments with SNP density above 2,200 almost always matched, but not always, while SNP density of 2,800 reaches the 97% threshold.

The X chromosome, by SNP count, below.

X segments reach the 100% threshold at about 1,600; however, we really need more results to be predictive at the same level as the results for chromosomes 1-22.  Two data samples really aren’t adequate.

Once again, Philip prepared a nice chart showing percentage of matching segments by SNP count, below.

Predictive

In the Segment Survival – 3 and 4 Generation Phasing article, one can see that phased matches are predictive, meaning that a child/parent match is highly suggestive that the segment is a valid segment match and that it will hold in generations further upstream.

Several years ago, Dr. Tim Janzen, one of the early phasing pioneers, suggested that people test their children, even if both parents had already tested. For the life of me, I couldn’t understand how that would be the least bit productive, genealogically, since people were more likely to match the parents than the children, and children only carry a subset of their parents’ DNA.

However, the predictive nature of a segment being legitimate with a child/parent match to a third party means that even in situations where your own parent isn’t available, a match by a third party on the same segment with your child suggests that the match is legitimate, not IBC.

In the article, I showed both 3 and 4 generations of phased comparisons between generations of the same family and a known cousin. The results of the 5 different family comparisons are shown below, where the red segments did not phase or lost phasing between generations, and the green segments did phase through multiple generations.

Very, very few segments lost phasing in upper (older) generations after matching between a parent and a child. In the five 4-generation examples above, only a total of 7 groups of segments lost phasing. The largest segment that lost phasing in upper generations was 3.69 cM. In two examples, no segments were lost due to not phasing in upper generations.

The net-net of this is that you can benefit by testing your children if your parents aren’t available, because the matches on the segment to both you and the child are most likely to be legitimate. Of course, there will be segments where someone matches you and not your child, because your child did not inherit that segment of your DNA, and those may be legitimate matches as well. However, the segments where you and your child both match the same person will likely be legitimate matches, especially over about 3.5 cM. Please read the Segment Survival article for more details.

If you want to order additional Family Finder tests for more family members, you can click here.

Group Analysis

Philip has performed a group analysis which has produced some expected results along with some surprising revelations. I’d prefer to let people get their feet wet with this tool and the results it provides before publishing the results, with one exception.

In case you’re wondering if the comparisons used as examples, above, are representative of typical results, Philip analyzed 10 of our beta testers and says the following:

The results are remarkably consistent between all 10 participants. Summing it up in words: with each person that you match you will have an average of 11 matching segments. Three will be genuine and will add to [a total of] 21 cM. Eight will be false and add to [a total of] 19 cM.
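
Working through the arithmetic behind Philip’s summary makes the contrast clearer. The short Python calculation below uses only the figures from his quote; the variable names are mine.

```python
# Figures taken from Philip's summary above
total_segments = 11
genuine_segments, genuine_cm = 3, 21
false_segments, false_cm = 8, 19

avg_genuine = genuine_cm / genuine_segments               # average size of a genuine segment
avg_false = false_cm / false_segments                     # average size of a false segment
share_by_count = genuine_segments / total_segments        # genuine share of the segments
share_by_cm = genuine_cm / (genuine_cm + false_cm)        # genuine share of the total cM

print(f"Average genuine segment: {avg_genuine:.1f} cM")
print(f"Average false segment: {avg_false:.1f} cM")
print(f"Genuine share by count: {share_by_count:.1%}")
print(f"Genuine share by cM: {share_by_cm:.1%}")
```

In other words, a genuine segment averages 7 cM while a false one averages under 2.5 cM. Only about 27% of the segments are genuine, in line with the 27% figure reported earlier for chromosomes 1-22, yet those few segments account for a little more than half of the total shared cM.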

Philip compiled the following chart summarizing 10 beta testers’ results. Please note that you can click to enlarge the images.

The X, being far less consistent, is shown below.

We Still Need Endogamous Parent-Child Trios

When I asked for volunteer testers, we were not able to obtain a trio of fully endogamous individuals. Specifically, we would like to see how the statistics for groups of non-endogamous individuals compare to the statistics for endogamous individuals.

Endogamous groups include people who are 100% Jewish, Amish, Mennonite, or have a significant amount of first or second cousin marriages in recent generations.

Of these, Jewish families prove to be the most highly endogamous, so if you are Jewish and have both Jewish parents’ DNA results, please run this tool and send either Philip or me the resulting spreadsheet. Your results won’t be personally identified, only the statistics used in conjunction with others, similar to the group analysis shown above. Your results will be entirely anonymous.

Philip’s e-mail is philip.gammon@optusnet.com.au and you can reach me at roberta@dnaexplain.com.

Caveat

Philip has created the Match-Maker-Breaker tool which is free to everyone. He has included some wonderful diagnostics, but Philip is not providing individual support for the tool. In other words, this is a “what you see is what you get” gift.

Thank You and Acknowledgements

Of course, a very big thank you to Philip for creating this tool, and also to people who volunteered as alpha and beta testers and provided feedback. Also thanks to Jim Kvochick for trying to coax Numbers into working.

Match-Maker-Breaker Author Bio:

Philip’s official tagline reads: Philip Gammon, BEng(ManSysEng) RMIT, GradDipSc(AppStatistics) Swinburne

I asked Philip to describe himself.

I’d describe myself as a business analyst with a statistics degree plus an enthusiastic genetic genealogist with an interest in the mathematical and statistical aspects of inheritance and cousinship.

The important aspect of Philip’s resume is that he is applying his skills to genetic genealogy where they can benefit everyone. Thank you so much Philip.

Watch for some upcoming guest articles from Philip.

Concepts – Segment Survival – 3 and 4 Generation Phasing

Have you ever had something you need to refer back to and can’t find it? I do this more often than I care to admit.

About a year ago, I did a study when I was writing the “Concepts – Parental Phasing” article where I tracked segment matches from generation to generation through three generations.

I wanted to see how small versus large segments fared during the phasing process with a known relative. In other words, if a known relative matches a child and a parent on the same segment, does that known relative also match the relevant grandparent on that same segment, or is that match “lost” in the older generation?

This first example shows the tester matching all 4 generations of the Curtis lineage.

The second example, below, shows the Tester matching only the two youngest generations, but not the Grandparent or Great-grandparent.

Obviously, for the segment to be genealogically relevant, meaning passed from the common ancestor to both the tester and the descendants in the Curtis line, the tester cannot match the child and parent without also matching the grandparent and great-grandparent, who have also tested.  For the match between the tester and the parent/child to be valid, meaning the DNA descended from the common ancestor, the DNA segment MUST also be carried by the Grandparent and Great-grandmother.

If the segment matches all four people, then it phases through all generations and is a solid phased match.

If the segment matches only two contiguous generations, and not the older generation, as shown above, the segment is identical by chance in the younger generations, and is not genealogically relevant.

A third situation is clearly possible, where the tester matches the older generation or generations, but not the younger. In this case, the DNA simply did not get passed on down to the younger generations. In the example shown below, the segment still phases between the Grandparent and the Great-grandmother.
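
The three situations just described can be summarized in a short Python sketch. It assumes every generation in the line has tested, takes a list of True/False values ordered from the youngest generation to the oldest (True meaning the tester matches that person on the segment), and uses a function name and labels that I made up for illustration.

```python
def phasing_status(matches):
    """matches: True/False per generation, ordered youngest to oldest."""
    if not any(matches):
        return "no match"
    youngest = matches.index(True)      # youngest generation that matches
    if not all(matches[youngest:]):
        return "identical by chance"    # an older generation fails to match
    if youngest == 0:
        return "phased through all generations"
    if youngest == len(matches) - 1:
        return "oldest generation only"  # nothing upstream to phase against
    return "phased in older generations, not passed down"

# Order: child, parent, grandparent, great-grandparent
print(phasing_status([True, True, True, True]))
print(phasing_status([True, True, False, False]))
print(phasing_status([False, False, True, True]))
```

These three calls correspond, in order, to the solid phased match, the match that is identical by chance in the younger generations, and the segment that simply was not passed down.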

I’ve extracted the results from the original article and am showing them here, along with a 4 generation study utilizing 5 different examples.

The results are important because they were unexpected, as far as I was concerned.

Let’s take a look at the original results first.

Original Study – 3 Generations – 2 Meiosis

In the first study comparing three generations, I compared four different groups of people to a known relative in their family line. None of the family groups included any of the same people.

If the known relative matches the youngest generations, meaning the child and the parent, both, the location was colored green. This means the match phased through one generation. If the known relative also matched the third generation, the grandparent, on that same location, the location remained green. If the known relative did not match the oldest generation in addition to the child and the parent, then the location was changed to red, because the phasing was lost.

Green means that the matches did phase in all three generations and red means they either did not phase or the phasing was “lost” in the older generation.  Lost, in this instance, means the DNA match never happened and it was “lost” during the analysis process.

I followed this same process for 4 separate groups of three individuals, resulting in the following distribution of matching segments through all three generations (green), versus segments that matched the younger two generations but not the older generation (red) or don’t phase at all, meaning they match only one of the two younger relatives.

I marked what appears to be a threshold with a black line.

As you can see, the phasing threshold cutoff appears to be someplace between 2.46 and 3.16 cM. These matches are through Family Tree DNA, so all segments will have 500 SNPs or more. In other words, almost all segments below that line phased to all three generations. Many or most segments above that line were lost in upstream generations. This means they were false matches, or identical by chance (IBC).

More segments phased to earlier generations than I expected.  I was especially surprised at the number of small segments and the low threshold, so I was anxious to see if the pattern held when utilizing 4 generations, which involves 3 meioses.

New Study – 4 Generations – 3 Meiosis

In any one generation, a match can occur by chance, but once the match has phased through the parent’s generation, meaning the cousin matches the child AND the parent on the same segment, it’s easy to assume that they would, logically, match through the next two generations upwards as well. But do they? Let’s take a look.

Instead of just the summary information provided in the 3 generation study, I’m going to be showing you the three steps in the evaluation process for each example we discuss. I think it will help to answer questions, as well as to enable you to follow these same steps for your own family.

In total, I did 5 separate 4 generation comparisons, labeled as Examples 1-5, below.

Example 1 – 4 Generation – 3 Meiosis (DL)

A known cousin was compared up the tree on the relevant line through 4 generations. The relationship of the testers is shown in the chart above, with the blue arrows.

On the Curtis line, 4 individuals in descending generations were tested:

  • Child
  • Parent
  • Grandparent
  • Great-grandparent

In the Solomon line, one descendant was tested.

The results show the DNA segments that phased for 2, 3 and 4 generations, which is a total of 3 meiosis, meaning three times that the DNA was passed from generation to generation between the Great-grandparent and the Child.

The individual whose matches are tracked below is a third cousin to the Great-grandparent of the group. The relationship of the cousin to the descendants of the great-grandparent is shown below.

In reality, the distance of the cousin relationship isn’t really relevant. The relevant aspect is that the cousin DOES match all 4 relatives that tested, and we can track the segments that the cousin matches to the child, parent or grandparent back through the great-grandparent to see if they phase, meaning to see if the match is legitimate or not. In other words, was the segment passed from the Great-grandparent to the Grandparent to the Parent to the Child?
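The question of whether two generations match the cousin "on the same segment" comes down to an overlap check on chromosome and start/end positions. Here is a minimal Python sketch of that check. It is not how the charts in this article were built (those were done by hand in a spreadsheet). The `Segment` fields, the `overlaps` helper and every coordinate shown are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One row of a match list (field names are illustrative assumptions)."""
    chromosome: str
    start: int
    end: int
    cm: float


def overlaps(a: Segment, b: Segment) -> bool:
    """True when both segments sit on the same chromosome and share positions."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


# Hypothetical matches between the cousin and two adjacent generations.
child = Segment("5", 10_000_000, 14_000_000, 4.2)
parent = Segment("5", 9_500_000, 15_000_000, 6.0)
elsewhere = Segment("5", 40_000_000, 43_000_000, 3.1)

print(overlaps(child, parent))     # the child's segment falls inside the parent's
print(overlaps(child, elsewhere))  # same chromosome, different location
```

An overlap alone only says the segment could have been passed down. Whether it really phases still depends on every adjacent generation upstream matching as well.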

This first chart shows the cousin’s matches to all 4 of the family members. I’ve colored them green if they have phased matches, meaning adjacent generations on the same segment. In the comment column, I’ve explained what you are seeing.

This chart is a little more complex than previously, because we are dealing with 4 generations instead of 3. Therefore, I’m showing the cousin’s matches to all 4 individuals.

  • For a location to have no color and be labeled “No Phased Match” means that there was a match to one family member, but not to the adjacent generation upstream, so it’s not a genealogically relevant match. In other words, it’s a false match.
  • For a location to have no color and be labeled “Oldest Gen Only” means that the cousin matches the great-grandmother only. Those matches may be genealogically relevant, but because we don’t have a generation upstream of her, we can’t phase them and can’t tell if they are relevant or not based only on the information we have here. Obviously you’ll want to evaluate each match individually to see if it is a legitimate or false match using additional criteria.
  • For a location to be colored green, it must phase entirely for all the generations from where it begins upwards in the tree. For some matches, that means all 4 generations. Some matches that do phase only phase for 2 or 3 generations, meaning that the segment did not get passed on to younger generations. The two shades of green are only to differentiate the match groups when they are adjacent on the spreadsheet.
  • If the cell is green and says “4 Gen Match,” it means that the match appeared in all 4 generations and matched (or at least overlapped.)
  • If the cell is green and says “3 Gen Match,” it means that the match appeared in the oldest 3 generations and matched. The match did NOT appear in the child’s generation, so what we know about this segment is that it did not get passed to the child, but in the three generations in which it does appear, it phased.
  • If the cell is green and says “2 Gen Match,” it means that it appeared in the oldest two generations and phased, but did NOT get passed to the parent, so it could not have been passed to the child.
  • Matches to any single generation (but not the immediate upstream generation) are labeled “No Phased Match.”
  • If the cell is red and says “Lost Phasing” it means that the segment phased in at least two generations but did NOT match the adjacent generation upstream. Therefore, this is an example of a segment that did phase in one generation, but that was actually identical by chance (IBC) further upstream. In the case of the red segments above, they phased in all three of the younger generations, only to become irrelevant in the oldest generation when the tester did not match the Great-grandmother.
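The labeling rules above can also be written down as a small Python sketch. It assumes you have already decided, for one segment location, which generations the cousin matches there. The `GENERATIONS` list and the `label_matches` function are hypothetical names for illustration only, not part of any vendor tool.

```python
# Youngest to oldest, as tested on the Curtis line.
GENERATIONS = ["Child", "Parent", "Grandparent", "Great-grandparent"]


def label_matches(matched):
    """Label each generation the cousin matches at one segment location.

    `matched` is the set of generation names with an overlapping match.
    Adjacent generations that all match form a "run" and share one label.
    """
    labels = {}
    oldest = len(GENERATIONS) - 1
    i = 0
    while i < len(GENERATIONS):
        if GENERATIONS[i] not in matched:
            i += 1
            continue
        # Extend the run upward while the next older generation also matches.
        j = i
        while j + 1 < len(GENERATIONS) and GENERATIONS[j + 1] in matched:
            j += 1
        run_length = j - i + 1
        if j == oldest:
            # The run reaches the oldest tester, so nothing upstream contradicts it.
            label = "Oldest Gen Only" if run_length == 1 else f"{run_length} Gen Match"
        elif run_length == 1:
            label = "No Phased Match"
        else:
            # Phased in younger generations, but the next generation up is missing.
            label = "Lost Phasing"
        for k in range(i, j + 1):
            labels[GENERATIONS[k]] = label
        i = j + 1
    return labels


print(label_matches({"Child", "Parent", "Grandparent", "Great-grandparent"}))
print(label_matches({"Child", "Parent", "Grandparent"}))
print(label_matches({"Child", "Grandparent", "Great-grandparent"}))
```

The third call shows a missing intermediate generation: the Grandparent and Great-grandparent receive "2 Gen Match" while the Child receives "No Phased Match", because the Parent is absent from the sequence.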

Now, let’s look at the same segment chart sorted by centiMorgan size.

Sorted by centiMorgan size gives you the opportunity to note that the larger segments are much more likely to phase, when given the opportunity. Translated, this means they are much more likely to be legitimate segments.

Formatted in the same way as the 3 generation groups, we see the following chart of only the segments, with the matches that were to the oldest generation only removed because they did not have the opportunity to phase. What we have below are the results for the matches that did have the opportunity to phase:

  • Green means the segment did phase
  • Red means the segment did not phase and/or lost phasing.
  • White rows that did NOT phase are red above, along with rows that lost phasing.
  • White rows that are labeled “Oldest Gen Only” were removed because they are the oldest generation and did not have the opportunity to phase with an older generation.
  • For details, refer to the original charts, above.
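If you keep your labeled segments in a list rather than a colored spreadsheet, the same green-versus-red tally can be sketched in a few lines of Python. The rows below are invented purely for illustration and are not the data from these charts. The `summarize` function and the cutoff value passed to it are assumptions you would adjust to your own results.

```python
def is_phased(label):
    """Green rows: any "2 Gen Match", "3 Gen Match" or "4 Gen Match" label."""
    return label.endswith("Gen Match")


def summarize(rows, cutoff_cm):
    """Count phased and not-phased segments below and at/above a cM cutoff.

    Rows labeled "Oldest Gen Only" are skipped because they never had the
    opportunity to phase against an older generation.
    """
    summary = {
        "below": {"phased": 0, "not_phased": 0},
        "at_or_above": {"phased": 0, "not_phased": 0},
    }
    for cm, label in rows:
        if label == "Oldest Gen Only":
            continue
        band = "below" if cm < cutoff_cm else "at_or_above"
        key = "phased" if is_phased(label) else "not_phased"
        summary[band][key] += 1
    return summary


# Hypothetical (cM, label) rows, NOT taken from the charts in this article.
rows = [
    (7.40, "4 Gen Match"),
    (1.21, "No Phased Match"),
    (12.90, "4 Gen Match"),
    (2.30, "Lost Phasing"),
    (5.10, "Oldest Gen Only"),
    (3.68, "3 Gen Match"),
    (1.87, "No Phased Match"),
]

for cm, label in sorted(rows):  # same view as sorting the spreadsheet by cM
    print(f"{cm:6.2f}  {label}")

print(summarize(rows, cutoff_cm=3.5))
```

Trying a few different cutoff values is a quick way to see where your own family’s segments begin to phase reliably.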

Example 2 – 4 Generation – 3 Meiosis (CF-SV)

A second 4 generation comparison with a first cousin to the Great-grandmother results in more matches due to the closeness of the relationship, yielding additional information.

The 4 individuals in this and the following 3 examples are related in the following fashion:

Child 1 and Child 2 are siblings and Cousin 1 and Cousin 2 are siblings.

The two cousins are first cousins to the great-grandmother, so related to the matching individuals in the following fashion:

Because first cousins are significantly closer than third cousins, we have a lot more matching segments to work with.

It’s worth noting in the above chart that the two groups colored with gold in the right column both look like they phase, but when you look at the relationships of the people involved, you quickly realize that an intermediate generation is missing.

In the first example, the Grandparent and Great-grandmother do phase, but the child does not, because the cousin doesn’t also match the parent on that segment, so the parent could NOT have passed that segment to the child.  Therefore, the child does not phase.

In the second example, the cousin matches the Parent and Great-grandmother, but the Grandparent is missing in the match sequence, so these people don’t phase at all.

Sorted by centiMorgan size, we see the following.

Formatted by phased segment size, where red means did not phase or lost phasing and green means phased, we see the following pattern emerge.

Example 3 – 4 Generation – 3 Meiosis (CF-PV)

The next comparison still uses Cousin 1, but compared to Child 2.

In this case, three segments lost phasing when compared to older generations. They look like they phased when comparing the cousin to the Parent and Child, but we know they don’t because they don’t match the Grandparent, the next adjacent generation upstream.

Sorted by centiMorgan size, we see the following:

It’s interesting that all of the segments that lost phasing were quite small.

Below, the segments are formatted by size, where red equals segments that did not phase or lost phasing and green equals segments that did phase.

Example 4 – 4 Generations – 3 Meiosis (DF-SV)

The fourth example utilizes Cousin 2 and Child 1.

In this comparison, no segments lost phasing, so there are no red segments.

The segments are sorted by centiMorgan size, above, and by phased versus unphased segments, below.

Example 5 – 4 Generations – 3 Meiosis (DF-PV)

This last example utilizes the results of Cousin 2 matching to Child 2.

Again we have a group identified by gold in the last column that looks like a phased group if you’re just looking at the chromosome start and end locations, until you notice that the Grandparent is missing. The Parent and Child do share an overlapping segment mathematically, and it appears that this is part of the Great-grandmother’s segment, but it isn’t because the segment did not pass through the Grandparent. Of course, there is always a small possibility that there is a read issue with the grandparent’s file in this location, but as it stands, the parent and child’s matching segment loses phasing because it does not phase to the grandparent.

Again, three segments lost phasing.

Above, the spreadsheet sorted by centiMorgan value and below, by phased and unphased segments.

Side By Side Comparison

This side by side comparison shows the 5 different comparisons of 4 generations and 3 meiosis.

The pattern looks very similar and is almost identical in terms of the threshold to the original 3 generation study. The 3 gen study thresholds varied from 2.46 to 3.16 cM. The largest 3 generation unphased segments were 3.36, 4.16, 4.75 and 6.05 cM.

This suggests that your results with a 3 generation study are probably nearly as reliable as those from a 4 generation study, although we did see one instance where phasing was lost after three matching generations. However, evaluating that match itself reveals that it was certainly highly questionable, with the Parent carrying more of the “matching” segment to the Child than the Grandparent carried. While it was technically a 3 generation match before losing phasing, it wasn’t a solid match by any means.

With more test data, this could also mean that off-shifted matches or questionable matches are more likely to not phase or fail in higher generations.  I wrote here about methodologies for determining legitimate and false matches.

Discussion

I assembled a summary of the pertinent information from the five different 4 generation charts.

  • As expected, very small segments often did not phase. Around the 3.5 cM region, though, they began to phase, and reliably so. Even so, some larger segments, one as large as 7.13 cM, did not phase.
  • It appears from the small number of segments that lost phasing that most of the time, if a segment does phase with the next generation upstream, it’s a valid segment and will continue to phase upwards.
  • Occasionally, phased segments are not valid and fail a “test” further up the tree. These are the segments that “lost phasing.”
  • The segments that did lose phasing were smaller segments with the largest at 3.68 cM.
  • Phasing, even in small segments, seems to be a relatively good predictor of a segment that is identical by descent, as determined by continuing to match ancestral segments on up the tree.

Of course, additional matches with cousins on the same segments would strengthen the argument as well, with or without phasing. Genetic genealogists are always looking for more information and ways to strengthen our evidence of connections with our cousins and family members. After all, that’s how we positively identify segments attributable to specific ancestors.

Testing Your Own Family

If you have either 3 or 4 individuals in descending generations, you can reproduce these same kinds of results for yourself. It’s actually easy and you can use the charts, methodology and color coding above as a guide.

You will need a relative that matches on the side of the oldest generation. In this case, the relatives were cousins of the great-grandmother. The relative will need to match the other two or three downstream people as well, meaning the direct descendants of the oldest relative. By copying the cousin’s entire match list from the Family Finder chromosome browser, you will be able to delete all matches other than to the people in your family group and compare the results using the same methodology I have shown.

If you don’t have access to the cousin’s match list, you can copy the matches to the cousin from the family member’s match lists and combine them into one spreadsheet.  The outcome is the same, but it’s easier if you have access to the cousin’s matches because you only have to download one file instead of 4.
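Deleting every match except the people in your family group can also be done with a short Python script instead of by hand. The sketch below is hedged accordingly: the column names ("Match Name", "Chromosome", "Start Location" and so on) are my assumption about what a chromosome browser export looks like and may not match your download, so check your file’s header row and adjust. The names in the sample and the `load_family_segments` function are hypothetical.

```python
import csv
import io


def load_family_segments(csv_text, family_names, name_column="Match Name"):
    """Keep only rows for the family group, sorted by chromosome and start.

    Column names are assumptions; change them to match your own export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for row in reader if row[name_column] in family_names]

    def sort_key(row):
        chrom = row["Chromosome"]
        # Put non-numeric chromosomes such as "X" after 1 through 22.
        return (int(chrom) if chrom.isdigit() else 23, int(row["Start Location"]))

    return sorted(rows, key=sort_key)


# Hypothetical export of the cousin's matches (all values made up).
sample = """Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPs
Child,5,10000000,14000000,4.2,900
Unrelated Match,2,500000,900000,1.5,500
Parent,5,9500000,15000000,6.0,1300
Great-grandparent,1,2000000,9000000,11.3,2500
"""

family = {"Child", "Parent", "Grandparent", "Great-grandparent"}
for row in load_family_segments(sample, family):
    print(row["Match Name"], row["Chromosome"], row["Start Location"], row["Centimorgans"])
```

To use a real download, read the file into a string first, for example with `open(path, newline="").read()`, and pass that in place of `sample`. If you are combining the family members’ four separate files instead, you would add a column recording whose file each row came from before merging them.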

What Can I Do With This Information?

Based on identifying segments as legitimate or false matches, you can label your DNA Master Spreadsheet with the information you’ve gleaned from the process. I’ve done that with just phasing to my mother. Studies such as this give me confidence that the larger phased segments with my mother are legitimate, as are even some segments below 5 cM, and as low as 3.5 cM, that DO phase.

These results and this article are NOT a suggestion that people should assume that ALL smaller segment matches are legitimate, because they aren’t. These studies are attempts to figure out HOW to discern which segments are valid and how to go about that process, including small segments. We now have four tools that can be utilized either together or individually:

  • Parental phasing
  • Multi-generation phasing, utilizing the parental phasing tools
  • Cousin Matching to phased segments, which is what we did in this article
  • Family Tree DNA’s Family Phasing, which in essence does this sort of matching for you, labeling your matches as to the side they descend from

From the phasing information we’ve discovered, it appears that most segments below 3.5 cM aren’t going to phase and the majority are NOT legitimate matches.

This is a limited study.  Additional information could change and would certainly add to this information.

More is Better

As always, more data is better. Additional examples of results using this same phasing/cousin matching technique would allow quantification of the reliability of phased results as compared to unphased results. In other words, we already know that phased results are much better and more reliable than unphased results, but how much more, and what are the functional limits of phased results?

There really is no question about the reliability of phased results in regard to larger segments, but additional information would help immensely in understanding how to successfully utilize smaller phased segments, in the range of 3.5 to 8 cM.

I would also suspect that in endogamous families, the thresholds observed here will move, probably with the phasing threshold moving even lower. People from fully endogamous cultures have many legitimate common small segments from sharing ancient ancestors. It would be interesting to observe the effects of endogamy on the observations made here.

I’m not Jewish and don’t have access to Jewish family information, but if several Jewish readers have tested multi-generational family and have a cousin from that side to test against, I would be glad to publish a followup article similar to this one with endogamous information.

It’s so exciting to be on the forefront of this wonderful genetic genealogy frontier together and to be able to experiment and learn.

I hope you use this methodology to explore, have fun and discover new information about your family.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.
