DNA: In Search of…Signs of Endogamy

This is the fourth in our series of articles about searching for unknown close family members, specifically parents, grandparents, or siblings. However, genealogists can apply these same techniques to ancestors further back in time as well.

In this article, we discuss endogamy – how to determine if you have it, from what population, and how to follow the road signs.

After introductions, we will be covering the following topics:

  • Pedigree collapse and endogamy
  • Endogamous groups
  • The challenge(s) of endogamy
  • Endogamy and unknown close relatives (parents, grandparents)
  • Ethnicity and Populations
  • Matches
  • AutoClusters
  • Endogamous Relationships
  • Endogamous DNA Segments
  • “Are Your Parents Related?” Tool
  • Surnames
  • Projects
  • Locations
  • Y DNA, Mitochondrial DNA, and Endogamy
  • Endogamy Tools Summary Tables
    • Summary of Endogamy Tools by Vendor
    • Summary of Endogamous Populations Identified by Each Tool
    • Summary of Tools to Assist People Seeking Unknown Parents and Grandparents

What Is Endogamy and Why Does It Matter?

Endogamy occurs when a group or population of people intermarry among themselves for an extended period of time, without the introduction of many or any people from outside of that population.

The effect of this continual intermarriage is that the founders’ DNA simply gets passed around and around, eventually in small segments.

That happens because there is no “other” DNA to draw from within the population. Knowing or determining that you have endogamy helps make sense of DNA matching patterns, and those patterns can lead you to unknown relatives, both close and distant.

This Article

This article serves two purposes.

  • This article is educational and relevant for all researchers. We discuss endogamy using multiple tools and examples from known endogamous people and populations.
  • In order to be able to discern endogamy when we don’t know who our parents or grandparents are, we need to know what signs and signals to look for, and why, which is based on what endogamy looks like in people who know their heritage.

There’s no crystal ball – no definitive “one-way” arrow, but there is a series of indications that suggest endogamy.

Depending on the endogamous population you’re dealing with, those signs aren’t always the same.

If you’re sighing now, I understand – but that’s exactly WHY I wrote this article.

We’re covering a lot of ground, but these road markers are invaluable diagnostic tools.

I’ve previously written about endogamy in the articles:

Let’s start with definitions.

Pedigree Collapse and Endogamy

Pedigree collapse isn’t the same as endogamy. Pedigree collapse is when you have ancestors that repeat in your tree.

In this example, the parents of our DNA tester are first cousins, which means the tester shares great-grandparents on both sides and, of course, the same ancestors from there on back in their tree.

This also means they share more of those ancestors’ DNA than they would normally share.

John Smith and Mary Johnson are both in the tree twice, in the same position as great-grandparents. Normally, Tester Smith would carry approximately 12.5% of each of his great-grandparents’ DNA, assuming for illustration purposes that exactly 50% of each ancestor’s DNA is passed in each generation. In this case, due to pedigree collapse, 25% of Tester Smith’s DNA descends from John Smith, and another 25% descends from Mary Johnson, double what it would normally be. 25% is the amount of DNA contribution normally inherited from grandparents, not great-grandparents.
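The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the same simplifying assumption used in the text, that exactly 50% of each ancestor’s DNA is passed on in every generation. The function name and its parameters are made up for this example.

```python
def expected_share(generations_back: int, appearances: int = 1) -> float:
    """Expected fraction of a tester's DNA inherited from one ancestor.

    Assumes exactly 50% is passed on per generation (a simplification),
    and that the ancestor appears `appearances` times at that generation.
    """
    return appearances * 0.5 ** generations_back


# Great-grandparents sit 3 generations back from the tester.
normal = expected_share(3)                    # ancestor appears once
collapsed = expected_share(3, appearances=2)  # ancestor appears twice

print(f"Great-grandparent, no collapse: {normal:.1%}")      # 12.5%
print(f"Great-grandparent, in tree twice: {collapsed:.1%}")  # 25.0%
print(f"Grandparent, for comparison: {expected_share(2):.1%}")  # 25.0%
```

Real inheritance is random, so actual percentages will vary around these expected values.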

While we may find first cousin marriages a bit eyebrow-raising today, they were quite common in the past. Both laws and customs varied with the country, time, social norms, and religion.

Pedigree Collapse and Endogamy Are NOT the Same

You might think that pedigree collapse and endogamy are one and the same, but there’s a difference. Pedigree collapse can lead to endogamy, but it takes more than one instance of pedigree collapse to morph into endogamy within a population. Population is the key word for endogamy.

The main difference is that pedigree collapse occurs with known ancestors in more recent generations for one person, while endogamy is longer-term and systemic in a group of people.

Picture a group of people, all descended from Tester Smith’s great-grandparents intermarrying. Now you have the beginnings of endogamy. A couple hundred or a few hundred years later, you have true endogamy.

In other words, endogamy is pedigree collapse on a larger scale – think of a village or a church.

My ancestors’ village of Schnait, in Germany, is shown above in 1685. One church and maybe 30 or 40 homes. According to church and other records, the same families had inhabited this village, and region, for generations. It’s a sure bet that both pedigree collapse and endogamy existed in this small community.

If pedigree collapse happens over and over again because there are no other people within the community to marry, then you have endogamy. In other words, with endogamy, you assuredly DO have historical pedigree collapse, generally back in time, often before you can identify those specific ancestors – because everyone descends from the same set of founders.

Endogamy Doesn’t Necessarily Indicate Recent Pedigree Collapse

With deep, historic endogamy, you don’t necessarily have recent pedigree collapse, and in fact, many people do not. Jewish people are a good example of this phenomenon. They shared ancestors for hundreds or thousands of years, depending on which group we are referring to, but in recent, known, generations, many Jewish people aren’t related. Still, their DNA often matches each other.

The good news is that there are telltale signs and signals of endogamy.

The bad news is that not all of these signs are obvious, which limits their usefulness to people seeking clues about unknown close relatives, and some other “signs” aren’t what they are believed to be.

Let’s step through each endogamy identifier, or “hint,” and then we will review how we can best utilize this information.

First, let’s take a look at groups that are considered to be endogamous.

Endogamous Groups

Jewish People – Specifically groups that were isolated from other groups of Jewish (and other) people: Ashkenazi (Germany, Northern France, and diaspora), Sephardic (Spanish, Iberia, and diaspora), Mizrahi (Israel, Middle Eastern, and diaspora), Ethiopian Jews, and possibly Jews from other locations such as Mountain Jews from Kazakhstan and the Caucasus.

Acadians – Descendants of about 60 French families who settled in “Acadia” beginning about 1604, primarily on the island of Nova Scotia, and intermarried among themselves and with the Mi’kmaq people. Expelled by the English in 1755, they were scattered in groups to various diasporic regions where they continued to intermarry and where their descendants are found today. Some Acadians became the Cajuns of Louisiana.

Anabaptist Protestant Faiths – Amish, Mennonite, and Brethren (Dunkards) and their offshoots are Protestant religious sects founded in Europe in the 14th, 15th, and 16th centuries on the principle of baptizing only adults or people who are old enough to choose to follow the faith, or rebaptizing people who had been previously baptized as children. These Anabaptist faiths tend to marry within their own group or church and often expel those who marry outside of the faith. Many emigrated to the American colonies and elsewhere, seeking religious freedom. Occasionally those groups would locate in close proximity and intermarry with each other, but they would not marry outside of the Anabaptist denominations.

Native American (Indigenous) People – all indigenous peoples found in North and South America before European colonization descended from a small number of original founders who probably arrived at multiple times.

Indigenous Pacific Islanders – Including indigenous peoples of Australia, New Zealand, and Hawaii prior to colonization. They are probably equally as endogamous as Native American people, but I don’t have specific examples to share.

Villages – European or other villages with little inflow or whose residents were restricted from leaving over hundreds of years.

Other groups may have significant multiple lines of pedigree collapse and therefore become endogamous over time. Some people from Newfoundland, French Canadians, and Mormons (Church of Jesus Christ of Latter-Day Saints) come to mind.

Endogamy is a process that occurs over time.

Endogamy and Unknown Relatives

If you know who your relatives are, you may already know you’re from an endogamous population, but if you’re searching for close relatives, it’s helpful to be able to determine if you have endogamous heritage, at least in recent generations.

If you know nothing about either parent, some of these tools won’t help you, at least not initially, but others will. However, as you add to your knowledge base, the other tools will become more useful.

If you know the identity of one parent, this process becomes at least somewhat easier.

In future articles, we will search specifically for parents and each of your four grandparents. In this article, I’ll review each of the diagnostic tools and techniques you can use to determine if you have endogamy, and perhaps pinpoint the source.

The Challenge

People with endogamous heritage are related in multiple, unknown ways, over many generations. They may also be related in known ways in recent generations.

If both of your parents share the SAME endogamous culture or group of relatives:

  • You may have significantly more autosomal DNA matches than people without endogamy, unless that group of people is under-sampled. Jewish people have significantly more matches, but Native people have fewer due to under-sampling.
  • You may experience a higher-than-normal cM (centiMorgan) total for estimated relationships, especially more distant relationships, 3C and beyond.
  • You will have many matches related to you on both your maternal and paternal sides.
  • Parts of your autosomal DNA will be the same on both your mother’s and father’s sides, meaning your DNA will be fully identical in some locations. (I’ll explain more in a minute.)

If either (or both) of your parents are from an endogamous population, you:

  • Will, in some cases, carry identifying Y and mitochondrial DNA that points to a specific endogamous group. This is true for Native people, can be true for Jewish people and Pacific Islanders, but is not true for Anabaptist people.

One Size Does NOT Fit All

Please note that there is no “one size fits all.”

Each or any of these tools may provide relevant hints, depending on:

  • Your heritage
  • How many other people have tested from the relevant population group
  • How many close or distant relatives have tested
  • If your parents share the same heritage
  • Your unique DNA inheritance pattern
  • If your parents, individually, were fully endogamous or only partly endogamous, and how far back generationally that endogamy occurred

For example, in my own genealogy, my maternal grandmother’s father was Acadian on his father’s side. While I’m not fully endogamous, I have significantly more matches through that line proportionally than on my other lines.

I have Brethren endogamy on my mother’s side via her paternal grandmother.

Endogamous ancestors are shown with red stars on my mother’s pedigree chart, above. However, please note that her maternal and paternal endogamous ancestors are not from the same endogamous population.

However, I STILL have fewer matches on my mother’s side in total than on my father’s side because my mother has recent Dutch and recent German immigrant ancestors, which reduces her total number of matches. Neither of those lines has had as much time to produce descendants in the US, and Europe is under-sampled when compared with the US, where more people tend to take DNA tests because they are searching for where they came from.

My father’s ancestors have been in the US since it was a British Colony, and I have many more cousins who have tested on his side than mother’s.

If you looked at my pedigree chart and thought to yourself, “that’s messy,” you’d be right.

The “endogamy means more matches” axiom does not hold true for me, comparatively, between my parents – in part because my mother’s German and Dutch lines are such recent immigrants.

The number of matches alone isn’t going to tell this story.

We are going to need to look at several pieces and parts for more information. Let’s start with ethnicity.

Ethnicity and Populations

Ethnicity can be a double-edged sword. It can tell you exactly nothing you couldn’t discern by looking in the mirror, or, conversely, it can be a wealth of information.

Ethnicity reveals the parts of the world where your ancestors originated. When searching for recent ancestors, you’re most interested in majority ethnicity, meaning the 50% of your DNA that you received from each of your parents.

Ethnicity results at each vendor are easy to find and relatively easy to understand.

This individual at FamilyTreeDNA is 100% Ashkenazi Jewish.

If they were 50% Jewish, we could then estimate, and that’s an important word, that either one of their parents was fully Jewish, and not the other, or that two of their grandparents were Jewish, although not necessarily on the same side.

On the other hand, my mother’s ethnicity, shown below, has nothing remarkable that would point to any majority endogamous population, yet she has two.

The only hint of endogamy from ethnicity would be her ~1% Americas, and that isn’t relevant for finding close relatives. However, minority ancestry is very relevant for identifying Native ancestors, which I wrote about, here.

You can correlate or track your ethnicity segments to specific ancestors, which I discussed in the article, Native American & Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments, here.

Since I wrote that article, FamilyTreeDNA has added the feature of ethnicity or population Chromosome Painting, based on where each of your populations fall on your chromosomes.

In this example on chromosome 1, I have European ancestry (blue), except for the pink Native segment, which occurs in the same location on my mother’s chromosome 1 as well.

Both 23andMe and FamilyTreeDNA provide chromosome painting AND the associated segment information so you can identify the relevant ancestors.

Ancestry is in the process of rolling out an ethnicity painting feature, BUT, it has no segment or associated matching information. While it’s interesting eye candy, it’s not terribly useful beyond the ethnicity information that Ancestry already provides. However, Jonny Perl at DNAPainter has devised a way to estimate Ancestry’s start and stop locations, here. Way to go Jonny!

Now all you need to do is convince your Ancestry matches to upload their DNA file to one of the three databases, FamilyTreeDNA, MyHeritage, and GEDMatch, that accept transfers, aka uploads. This allows matching with segment data so that you can identify who matches you on that segment, track your ancestors, and paint your ancestral segments at DNAPainter.

I provided step-by-step instructions, here, for downloading your raw DNA file from each vendor in order to upload the file to another vendor.

Ethnicity Sides

Three of the four DNA testing vendors, 23andMe, FamilyTreeDNA, and recently, Ancestry, attempt to phase your ethnicity DNA, meaning to assign it to one parental “side” or the other – both in total and on each chromosome.

Here’s Ancestry’s SideView, where your DNA is estimated to belong to parent 1 and parent 2. I detailed how to determine which side is which, here, and while that article was written specifically pertaining to Ancestry’s SideView, the technique is relevant for all the vendors who attempt to divide your DNA into parents, a technique known as phasing.

I say “attempt” because phasing may or may not be accurate, meaning the top chromosome may not always be parent 1, and the bottom chromosome may not always be parent 2.

Here’s an example at 23andMe.

See the two yellow segments. They are both assigned as Native. I happen to know one is from the mother and one is from the father, yet they are both displayed on the “top” chromosome, which one would interpret to be the same parent.

I am absolutely positive this is not the case because this is a close family member, and I have the DNA of the parent who contributed the Native segment on chromosome 1, on the top chromosome. That parent does not have a Native segment on chromosome 2 to contribute. So that Native segment had to be contributed by the other parent, but it’s also shown on the top chromosome.

The DNA segments circled in purple belong together on the same “side” and were contributed to the tester by the same parent. The Native segment on chromosome 2 abuts a purple African segment, suggesting perhaps that the ancestor who contributed that segment was mixed between those ethnicities. In the US, that suggests enslavement.

The other African segments, circled, are shown on the second chromosome in each pair.

To be clear, parent 1 is not assigned by the vendors to either mother or father and will differ by person. Your parent 1, or the parent on the top chromosome may be your mother and another person’s parent 1 may be their father.

As shown in this example, parents can vary by chromosome, a phenomenon known as “strand swap.” Occasionally, the DNA can even be swapped within a chromosome assignment.

You can, however, get an idea of the division of your DNA at any specific location. As shown above, you can only have a maximum of two populations of DNA on any one chromosome location.

In our example above, this person’s majority ancestry is European (blue.) On each chromosome where we find a minority segment, the opposite chromosome in the same location is European, meaning blue.

Let’s look at another example.

At FamilyTreeDNA, the person whose ethnicity painting is shown below has a Native American (pink) ancestor on their father’s side. FamilyTreeDNA has correctly phased or identified their Native segments as all belonging to the second chromosome in each pair.

Looking at chromosome 18, for example, most of their father’s chromosome is Native American (pink). The other parent’s chromosome is European (dark blue) at those same locations.

If one of the parents were of one ethnicity and the other parent of a completely different ethnicity, then one bar of each chromosome would be all pink, for example, and the other would be entirely blue, representing the other ethnicity.

Phasing ethnicity or populations to maternal and paternal sides is not foolproof, and each chromosome is phased individually.

Ethnicity can, in some cases, give you a really good idea of what you’re dealing with in terms of heritage and endogamy.

If someone had an Ashkenazi Jewish father and European mother, for example, one copy of each chromosome would be yellow (Ashkenazi Jewish), and one would be blue (European.)

However, if each of their parents were half European Jewish and half European (not Jewish), then their different colored segments would be scattered across their entire set of chromosomes.

In this case, both of the tester’s parents are mixed – European Jewish (green) and Western Europe (blue.) We know both parents are admixed from the same two populations because in some locations, both parents contributed blue (Western Europe), and in other locations, both contributed Jewish (green) segments.

Both MyHeritage and Ancestry provide a secondary tool that’s connected to ethnicity, but different and generally in more recent times.

Ancestry’s DNA Communities

While your ethnicity may not point to anything terribly exciting in terms of endogamy, DNA Communities might. Ancestry says that a DNA Community is a group of people who share DNA because their relatives recently lived in the same place at the same time, and that communities are much smaller than ethnicity regions and reach back only about 50-300 years.

Based on the ancestors’ locations in my tree and in the trees of my matches, Ancestry has determined that I’m connected to two communities. In my case, the blue group is clearly my father’s line. The orange group could be either parent, or even a combination of both.

My endogamous Brethren could be showing up in Maryland, Pennsylvania, and Ohio, but it’s uncertain, in part, because my father’s ancestral lines are found in Virginia, West Virginia, and Maryland too.

These aren’t useful for me, but they may be more useful for fully endogamous people, especially in conjunction with ethnicity.

My Acadian cousin’s European ethnicity isn’t informative.

However, viewing his DNA Communities puts his French heritage into perspective, especially combined with his match surnames.

I wrote about DNA Communities when it was introduced with the name Genetic Communities, here.

MyHeritage’s Genetic Groups

MyHeritage also provides a similar feature that shows where my matches’ ancestors lived in the same locations as mine.

One difference, though, is that testers can adjust their ethnicity results confidence level from high, above, to low, below where one of my Genetic Groups overlaps my ethnicity in the Netherlands.

You can also sort your matches by Genetic Groups.

The results show you not only who is in the group, but how many of your matches are in that group too, which provides perspective.

I wrote about Genetic Groups, here.

Next, let’s look at how endogamy affects your matches.

Matches

The number of matches for a person from an entirely endogamous community may be quite different from the number for a person with no endogamy.

FamilyTreeDNA provides a Family Matching feature that triangulates your matches and assigns them to your paternal or maternal side by using known matches that you have linked to their profile cards in your tree. You must link people for the Family Matching feature known as “bucketing” to be enabled.

The people you link are then processed for shared matches on the same chromosome segment(s). Triangulated individuals are then deposited in your maternal, paternal, and both buckets.

Obviously, your two parents are the best people to link, but if they haven’t tested (or uploaded their DNA file from another vendor) and you have other known relatives, link them using the Family Tree tab at the top of your personal page.

I uploaded my Ancestry V4 kit to use as an example for linking. Let’s pretend that’s my sister. If I had not already linked my Ancestry V4 kit to “my sister’s” profile card, I’d want to do that and link other known individuals the same way. Just drag and drop the match to the correct profile card.

Note that a full or half sibling will be listed as such at FamilyTreeDNA, but an identical twin will show as a potential parent/child match to you. You’re much more likely to find a parent than an identical twin, but just be aware.

I’ve created a table of FamilyTreeDNA bucketed match results, by category, comparing the number of matches in endogamous categories with non-endogamous.

| Tester | Total Matches | Maternal Matches | Paternal Matches | Both | % Both | % DNA Unassigned |
| --- | --- | --- | --- | --- | --- | --- |
| 100% Jewish | 34,637 | 11,329 | 10,416 | 4,806 | 13.9 | 23.3 |
| 100% Jewish | 32,973 | 10,700 | 9,858 | 4,606 | 14 | 23.7 |
| 100% Jewish | 32,255 | 9,060 | 10,970 | 3,892 | 12 | 25.8 |
| 75% Jewish | 24,232 | 11,846 | Only mother linked | Only mother linked | Only mother linked | |
| 100% Acadian | 8,093 | 3,826 | 2,299 | 1,062 | 13 | 11 |
| 100% Acadian | 7,828 | 3,763 | 1,825 | 923 | 11.8 | 17 |
| Not Endogamous | 6,760 | 3,845 | 1,909 | 13 | 0.19 | 14.5 |
| Not Endogamous | 7,723 | 1,470 | 3,317 | 6 | 0.08 | 38 |
| 100% Native American | 1,115 | Unlinked | Unlinked | Unlinked | | |
| 100% Native American | 885 | 290 | Unknown | Can’t calculate without at least one link on both sides | | |

The 100% Jewish, Acadian, and Not Endogamous testers have all linked both of their parents, so their matches, if valid (meaning not identical by chance, which I discussed here), will match them plus one or the other parent.

One person is 75% Jewish and has only linked their Jewish mother.

The Native people have not tested their parents, and the first Native person has not linked anyone in their tree. The second Native person has only linked a few maternal matches, but their mother has not tested. They are seeking their father.

It’s very difficult to find people who are fully Native as testers. Furthermore, Native people are under-sampled. If anyone knows of fully Native (or other endogamous) people who have tested and linked their parents or known relatives in their trees, and will allow me to use their total match numbers anonymously, please let me know.

As you can see, Jewish, Acadian, and Native people are 100% endogamous, but many more Jewish people than Native people have tested, so you CAN’T judge endogamy by the total number of matches alone.

In fact, in order:

  • Fully Jewish testers have about 4-5 times as many matches as the Acadian and Non-endogamous testers
  • Acadian and Non-endogamous testers have about 5-6 times as many matches as the Native American testers
  • Fully Jewish people have about 30 times more matches than the Native American testers

If a person’s endogamy with a particular population is only on their maternal or paternal side, they won’t have a significant number of people related to both sides, meaning few people will fall into the “Both” bucket. People that will always be found in the “Both” bucket are full siblings and their descendants, along with descendants of the tester, assuming those matches are linked to their profiles in the tester’s tree.

In the case of our Jewish testers, you can easily see that the “Both” bucket is very high. The Acadians are also higher than one would reasonably expect without endogamy. A non-endogamous person might have a few matches on both sides, assuming the parents are not related to each other.

A high number of “Both” matches is a very good indicator of endogamy within the same population on both parents’ sides.

The percentage of people who are assigned to the “Both” bucket is between 11% and 14% in the endogamous groups, and less than 1% in the non-endogamous group, so statistically not relevant.

As demonstrated by the Native people compared to the Jewish testers, the total number of matches can be deceiving.

However, being related to both parents, as indicated by the “Both” bucket, unless you have pedigree collapse, is a good indicator of endogamy.

Of course, if you don’t know who your relatives are, you can’t link them in your tree, so this type of “hunt” won’t generally help people seeking their close family members.

However, you may notice that you’re matching people PLUS both of their parents. If that’s the case, start asking questions of those matches about their heritage.

A very high number of total matches, as compared to non-endogamous people, combined with some other hints might well point to Jewish heritage.

I included the % DNA Unassigned category because, when both parents are linked, it is the percentage of matches that are identical by chance, meaning the match doesn’t match either of the tester’s parents. Everyone with matches listed in the “Both” category has linked both of their parents, not just maternal and paternal relatives.
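For readers who want to check their own bucket counts, here is a minimal Python sketch of the two percentages in the table. It assumes that “% Both” is the Both count divided by total matches, and that “% DNA Unassigned” is the share of matches not placed in any bucket. That interpretation reproduces the table’s figures to within rounding, but it is my assumption, not a formula published by the vendor. The names used here are made up for the example.

```python
# (tester, total, maternal, paternal, both), copied from the table above.
# Only testers who have linked both parents are included.
TESTERS = [
    ("100% Jewish", 34637, 11329, 10416, 4806),
    ("100% Jewish", 32973, 10700, 9858, 4606),
    ("100% Jewish", 32255, 9060, 10970, 3892),
    ("100% Acadian", 8093, 3826, 2299, 1062),
    ("100% Acadian", 7828, 3763, 1825, 923),
    ("Not Endogamous", 6760, 3845, 1909, 13),
    ("Not Endogamous", 7723, 1470, 3317, 6),
]


def bucket_percentages(total, maternal, paternal, both):
    """Return (% Both, % unassigned), each rounded to one decimal place."""
    unassigned = total - maternal - paternal - both
    return round(100 * both / total, 1), round(100 * unassigned / total, 1)


for tester, total, maternal, paternal, both in TESTERS:
    pct_both, pct_unassigned = bucket_percentages(total, maternal, paternal, both)
    print(f"{tester:<15} Both: {pct_both:>5}%  Unassigned: {pct_unassigned:>5}%")
```

If you substitute your own counts, a “Both” percentage in the double digits resembles the endogamous testers here, while a fraction of one percent resembles the non-endogamous testers. With only seven testers, treat that as a hint, not a rule.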

Matching Location at MyHeritage

MyHeritage provides a matching function by location. Please note that it’s the location of the tester, but that may still be quite useful.

The locations are shown in most-matches to least-matches order. Clicking on the location shows the people who match you who are from that location. This would be the most useful in situations where recent immigration has occurred. In my case, my great-grandfather from the Netherlands arrived in the 1860s, and my German ancestors arrived in the 1850s. Neither of those groups is endogamous, though, unless it would be on a village level.

AutoClusters

Let’s shift to Genetic Affairs, a third-party tool available to everyone.

Using their AutoCluster function, Genetic Affairs clusters your matches together who match both each other and you.

This is an example of the first few clusters in my AutoCluster. You can see that I have several colored clusters of various sizes, but none are huge.

Compare that to the following endogamous cluster, sample courtesy of EJ Blom at Genetic Affairs.

If your AutoCluster at Genetic Affairs looks something like this, a huge orange blob in the upper left-hand corner, you’re dealing with endogamy.

Please note that the size of your cluster is also a function of both the number of testers and the match threshold you select. I always begin by using the defaults. I wrote about using Genetic Affairs, here.

If you tested at or transferred to MyHeritage, they too license AutoClusters, but have optimized the algorithm to tease out endogamous matches so that their Jewish customers, in particular, don’t wind up with a huge orange block of interrelated people.

You won’t see the “endogamy signature” huge cluster in the corner, so you’re less likely to be able to discern endogamy from a MyHeritage cluster alone.

The commonality between these Jewish clusters at MyHeritage is that they all tend to be rather uniform in size and small, with lots of grey connecting almost all the blocks.

Grey cells indicate people who match people in two colored groups. In other words, there is often no clear division in clusters between the mother’s side and the father’s side in Jewish clusters.

In non-endogamous situations, even if you can’t identify the parents, the clusters should still fall into two sides, meaning a group of clusters for each parent’s side that are not related to each other.
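The “two sides” idea can be illustrated with a small Python sketch. It treats each cluster as a node, treats grey cells (people who match into two clusters) as links between clusters, and then counts how many separate groups of linked clusters exist. The cluster names and links below are entirely hypothetical, and this is not how Genetic Affairs or MyHeritage build their clusters. It only shows what “falling into two sides” means.

```python
from collections import defaultdict


def cluster_sides(clusters, cross_links):
    """Group clusters into 'sides': sets of clusters connected by grey cells.

    clusters: list of cluster names
    cross_links: pairs of clusters that share grey-cell matches
    """
    graph = defaultdict(set)
    for a, b in cross_links:
        graph[a].add(b)
        graph[b].add(a)

    seen, sides = set(), []
    for start in clusters:
        if start in seen:
            continue
        # Walk outward from this cluster, collecting everything linked to it.
        stack, side = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            side.add(node)
            stack.extend(graph[node] - seen)
        sides.append(side)
    return sides


# Hypothetical non-endogamous tester: two groups of clusters with no grey
# cells between them, one group per parent.
separate = cluster_sides(["A", "B", "C", "D"], [("A", "B"), ("C", "D")])

# Hypothetical endogamous tester: grey cells connect almost everything.
tangled = cluster_sides(
    ["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
)

print(len(separate))  # 2
print(len(tangled))   # 1
```

Real cluster reports are rarely this tidy, so use the count of separate groups as one more hint, not a verdict.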

You can read more about Genetic Affairs clusters and their tools, here. DNAGedcom.com also provides a clustering tool.

Endogamous Relationships

Endogamous estimated relationships are sometimes high. Please note the word, “sometimes.”

Using the Shared cM Project tool relationship chart, here, at DNAPainter, people with heavy endogamy will discover that estimated relationships MAY be on the high side, or the relationships may, perhaps, be estimated too “close” in time. That’s especially true for more distant relationships, but surprisingly, it’s not always true. The randomness of inheritance still comes into play, and so do potential unknown relatives. Hence, the emphasis on the word “may.”

Unfortunately, it’s often stated as “conventional wisdom” that Jewish matches are “always” high, and first cousins appear as siblings. Let’s see what the actual data says.

At DNAPainter, you can either enter the amount of shared DNA (cM), or the percent of shared DNA, or just use the chart provided.

I’ve assembled a compilation of close relationships in kits that I have access to or from people who were generous enough to share their results for this article.

I’ve used Jewish results, which is a highly endogamous population, compared with non-endogamous testers.

The “Jewish Actual” column reports the total amount of DNA shared with that relative, for example, between a tester and their grandparent. The Average Range is the average plus the range from DNAPainter. The Percent Difference is the percentage difference between the actual number and the DNAPainter average.

You’ll see fully Jewish testers, at left, matching with their family members, and a Non-endogamous person, at right, matching with their same relative.

| Relationship | Jewish Actual (cM) | Percent Difference from Average | Average (Range) | Non-endogamous Actual (cM) | Percent Difference from Average |
| --- | --- | --- | --- | --- | --- |
| Grandparent | 2141 | 22 | 1754 (984-2482) | 1742 | <1 lower |
| Grandparent | 1902 | 8.5 | 1754 (984-2482) | 1973 | 12 |
| Sibling | 3039 | 16 | 2613 (1613-3488) | 2515 | 3.5 lower |
| Sibling | 2724 | 4 | 2613 (1613-3488) | 2761 | 5.5 |
| Half-Sibling | 2184 | 24 | 1759 (1160-2436) | 2127 | 21 |
| Half-Sibling | 2128 | 21 | 1759 (1160-2436) | 2352 | 34 |
| Aunt/Uncle | 2066 | 18.5 | 1741 (1201-2282) | 1849 | 6 |
| Aunt/Uncle | 2031 | 16.5 | 1741 (1201-2282) | 2097 | 20 |
| 1C | 1119 | 29 | 866 (396-1397) | 959 | 11 |
| 1C | 909 | 5 | 866 (396-1397) | 789 | 9 lower |
| 1C1R | 514 | 19 | 433 (102-980) | 467 | 8 |
| 1C1R | 459 | 6 | 433 (102-980) | 395 | 9 lower |

These totals are from FamilyTreeDNA except one from GEDMatch (one Jewish Half-sibling).

Totals may vary by vendor, even when matching with the same person. 23andMe includes the X segments in the total cMs and also counts fully identical segments twice. MyHeritage imputation seems to err on the generous side.

However, in these dozen examples:

  • You can see that the Jewish actual amount of DNA shared is always more than the average in the estimate.
  • Red means the overage is more than 100 cM.
  • The percentage difference is probably more meaningful because 100 cM is a smaller percentage of a 1754 cM grandparent connection than of a 433 cM 1C1R.
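
For readers who want to check their own numbers, here is a minimal Python sketch of how a percent difference like the one in the table can be calculated. The function names are hypothetical, and the figures are copied from three rows of the table; this is only an illustration of the arithmetic, not a DNAPainter feature.

```python
def percent_difference(actual_cm, average_cm):
    """Signed percent difference of the actual shared cM versus the published average."""
    return (actual_cm - average_cm) / average_cm * 100


def within_range(actual_cm, low_cm, high_cm):
    """True when the actual total falls inside the published range."""
    return low_cm <= actual_cm <= high_cm


# (label, actual cM, average cM, range low, range high), taken from the table
examples = [
    ("Grandparent (Jewish)", 2141, 1754, 984, 2482),
    ("Grandparent (non-endogamous)", 1742, 1754, 984, 2482),
    ("1C1R (Jewish)", 514, 433, 102, 980),
]

for label, actual, average, low, high in examples:
    diff = percent_difference(actual, average)
    direction = "higher" if diff >= 0 else "lower"
    in_range = within_range(actual, low, high)
    print(f"{label}: {abs(diff):.1f}% {direction} than average, within range: {in_range}")
```

A high percentage on its own proves nothing, since non-endogamous matches can run high too, but a consistent pattern across many matches is worth noticing.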

However, you can’t tell anything about endogamy by just looking at any one sample, because:

  • Some of the Non-Endogamous matches are high too. That’s just the way of random inheritance.
  • All of the actual Jewish match numbers are within the published ranges, but on the high side.

Furthermore, it can get more complex.

Half Endogamous

I requested assistance from Jewish genealogy researchers, and a lovely lady, Sharon, reached out, compiled her segment information, and shared it with me, granting permission to share with you. A HUGE thank you to Sharon!

Sharon is half-Jewish via one parent, and her half-sibling is fully Jewish. Their half-sibling match to each other at Ancestry is 1756 cM with a longest segment of 164 cM.

How does Jewish matching vary if you’re half-Jewish versus fully Jewish? Let’s look at 21 people who match both Sharon and her fully Jewish half-sibling.

Sharon shared the differences in 21 known Jewish matches with her and her half-sibling. I’ve added the Relationship Estimate Range from DNAPainter and colorized the highest of the two matches in yellow. Bolding in the total cM column shows a value above the average range for that relationship.

Total Matching cMs is on the left, with Longest Segment on the right.

While this is clearly not a scientific study, it is a representative sample.

The fully Jewish sibling carries more Jewish DNA, which is available for other Jewish matches to match as a function of endogamy (identical by chance/population), so I would have expected the fully Jewish sibling to match most if not all Jewish testers at a higher level than the half-Jewish sibling.

However, that’s not universally what we see.

The fully Jewish sibling is not always the sibling with the higher total match to the other Jewish testers, although the half-Jewish tester has the larger “Longest Segment” more often than not.

Approximately two-thirds of the time (13/21), the fully Jewish person does have a higher total matching cM, but about one-third of the time (8/21), the half-Jewish sibling has a higher matching cM.

About one-fourth of the time (5/21), the fully Jewish sibling has the longest matching segment, and about two-thirds of the time (13/21), the half-Jewish sibling does. In three cases, or about 14% of the time, the longest segment is equal, which may indicate that it’s the same segment.
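
The fractions quoted above are rounded. As a quick check, this small Python sketch converts the counts from Sharon’s 21 matches into percentages. The labels are only descriptive text chosen for this illustration.

```python
total_matches = 21

# Counts taken from the comparison of Sharon and her half-sibling
counts = {
    "fully Jewish sibling has higher total cM": 13,
    "half-Jewish sibling has higher total cM": 8,
    "fully Jewish sibling has longest segment": 5,
    "half-Jewish sibling has longest segment": 13,
    "longest segment is equal": 3,
}

# Convert each count to a percentage of the 21 matches
percentages = {label: round(n / total_matches * 100, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```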

Because of endogamy, Jewish matches are more likely to have:

  • Larger than average total cM for the specific relationship
  • More and smaller matching segments

However, as we have seen, neither of those is definitive, nor always true. Jewish matches and relationships are not always overestimated.

Ancestry and Timber

Please note that Ancestry downweights some matches by removing some segments using their Timber algorithm. Based on my matches and other accounts that I manage, Ancestry does not downweight in the 2-3rd cousin category, which is 90 cM and above, but they do begin downweighting in the 3-4th cousin category, below 90 cM, where my “Extended Family” category begins.

If you’ve tested at Ancestry, you can check for yourself.

By clicking on the amount of DNA you share with your match on your match list at Ancestry, shown above, you will be taken to another page where you will be able to view the unweighted shared DNA with that match, meaning the amount of DNA shared before the downweighting and removal of some segments, shown below.

Given the downweighting, and the information in the spreadsheet provided by Sharon, it doesn’t appear that any of those matches would have been in a category to be downweighted.

Therefore, for these and other close matches, Timber wouldn’t be a factor, but would potentially be in more distant matches.

Endogamous Segments

Endogamous matches tend to have smaller and more segments. Small amounts of matching DNA tend to skew the total DNA cM upwards.

How and why does this happen?

Ancestral DNA from further back in time tends to be broken into smaller segments.

Sometimes, especially in endogamous situations, two smaller segments, at one time separated from each other, manage to join back together again and form a match, but the match is only due to ancestral segments – not because of a recent ancestor.

Please note that different vendors have different minimum matching cM thresholds, so smaller matches may not be available at all vendors. Remember that factors like Timber and imputation can affect matching as well.

Let’s take a look at an example. I’ve created a chart where two ancestors have their blue and pink DNA broken into 4 cM segments.

They have children, a blue child and a pink child, and the two children, shown above, each inherited the same blue 4 cM segment and the same pink 4 cM segment from their respective parents. The other unlabeled pink and blue segments are not inherited by these two children, so those unlabeled segments are irrelevant in this example.

The parents may have had other children who inherited those same 4 cM labeled pink and blue segments as well, and if not, the parents’ siblings were probably passing at least some of the same DNA down to their descendants too.

The blue and pink children had children, and their children had children – for several generations.

Time passed, and their descendants became an endogamous community. Those pink and blue 4 cM segments may at some time be lost during recombination in the descendants of each of their children, shown by “Lost pink” and “Lost blue.”

However, because there is only a very limited amount of DNA within the endogamous community, their descendants may regain those same segments again from their “other parent” during recombination, downstream.

In each generation, the DNA of the descendant carrying the original blue or pink DNA segment is recombined with their partner. Given that the partners are both members of the same endogamous community, the two people may have the same pink and/or blue DNA segments. If one parent doesn’t carry the pink 4 cM segment, for example, their offspring may receive that ancestral pink segment from the other parent.

They could potentially, and sometimes do, receive that ancestral segment from both parents.

In our example, the descendants of the blue child, at left, lost the pink 4 cM segment in generation 3, but a few generations later, in generation 11, that descendant child inherited that same pink 4 cM segment from their other parent. Therefore, both the 4 cM blue and 4 cM pink segments are now available to be inherited by the descendants in that line. I’ve shown the opposite scenario in the generational inheritance at right where the blue segment is lost and regained.

Once rejoined, that pink and blue segment can be passed along together for generations.

The important part, though, is that once those two segments butt up against each other again during recombination, they aren’t just two separate 4 cM segments, but one segment that is 8 cM long – that is now equal to or above the vendors’ matching threshold.

This is why people descended from endogamous populations often have the following matching characteristics:

  • More matches
  • Many smaller segment matches
  • Their total cM is often broken into more, smaller segments
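
The rejoining of the pink and blue 4 cM segments described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: positions are expressed directly in cM, the 7 cM threshold is hypothetical, and the function names are invented for this example. Real vendor matching is considerably more involved.

```python
def merge_adjacent(segments):
    """Merge segments that touch or overlap; each segment is (start_cm, end_cm)."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # This segment butts up against (or overlaps) the previous one
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def reportable(segments, threshold_cm):
    """Keep only segments long enough to meet the matching threshold."""
    return [(start, end) for start, end in segments if end - start >= threshold_cm]


THRESHOLD_CM = 7  # hypothetical threshold, chosen only for this illustration

separated = [(10, 14), (30, 34)]  # two 4 cM segments far apart
rejoined = [(10, 14), (14, 18)]   # the same two segments, now side by side

print(reportable(merge_adjacent(separated), THRESHOLD_CM))
print(reportable(merge_adjacent(rejoined), THRESHOLD_CM))
```

While the two segments are separated, neither is long enough to be reported. Once they sit side by side, they behave as one 8 cM segment that clears the threshold.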

What do more, smaller segments look like, exactly?

More, Smaller Segments

All of our vendors except Ancestry have a chromosome browser for their customers to compare their DNA to that of their matches visually.

Let’s take a look at some examples of what endogamous and non-endogamous matches look like.

For example, here’s a screen shot of a random Jewish second cousin match – 298 cM total, divided into 12 segments, with a longest segment of 58 cM.

A second Jewish 2C with 323 cM total, across 19 segments, with a 69 cM longest block.

A fully Acadian 2C match with 600 cM total, across 27 segments, with a longest segment of 69 cM.

A second Acadian 2C with 332 cM total, across 20 segments, with a longest segment of 42 cM.

Next, a non-endogamous 2C match with 217 cM, across 7 segments, with a longest segment of 72 cM.

Here’s another non-endogamous 2C example, with 169 shared cM, across 6 segments, with a longest segment of 70 cM.

Here’s the second cousin data in a summary table. The take-away from this is the number of total segments relative to the total cM.

Tester Population | Total cM | Longest Block | Total Segments
Jewish 2C | 298 | 58 | 12
Jewish 2C | 323 | 69 | 19
Acadian 2C | 600 | 69 | 27
Acadian 2C | 332 | 42 | 20
Non-endogamous 2C | 217 | 72 | 7
Non-endogamous 2C | 169 | 70 | 6
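
One simple way to put numbers on “more, smaller segments” is to divide the total cM by the number of segments and to look at how much of the total sits in the longest block. The Python sketch below does that for the six second cousins in the table. The function names are hypothetical, and with only six samples this is an illustration, not a validated test for endogamy.

```python
def average_segment_cm(total_cm, segment_count):
    """Average size of a matching segment in cM."""
    return total_cm / segment_count


def longest_block_share(total_cm, longest_cm):
    """Percent of the total shared cM found in the longest block."""
    return longest_cm / total_cm * 100


# (tester, total cM, longest block, total segments), taken from the table
second_cousins = [
    ("Jewish 2C", 298, 58, 12),
    ("Jewish 2C", 323, 69, 19),
    ("Acadian 2C", 600, 69, 27),
    ("Acadian 2C", 332, 42, 20),
    ("Non-endogamous 2C", 217, 72, 7),
    ("Non-endogamous 2C", 169, 70, 6),
]

for tester, total, longest, segments in second_cousins:
    avg = average_segment_cm(total, segments)
    share = longest_block_share(total, longest)
    print(f"{tester}: average segment {avg:.1f} cM, longest block is {share:.1f}% of total")
```

In this small sample, the endogamous matches have smaller average segments, and their longest block makes up a smaller share of the total.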

You can see more examples and comparisons between Native American, Jewish and non-endogamous DNA individuals in the article, Concepts – Endogamy and DNA Segments.

I suspect that a savvy mathematician could predict endogamy based on longest block and total segment information.

Lara Diamond, a mathematician who writes at Lara’s Jewnealogy, might be up for this challenge. She just published compiled matching and segment information in her Ashkenazic Shared DNA Survey Results for those who are interested. You can also contribute to Lara’s data, here.

Endogamy, Segments, and Distant Relationships

While not relevant to searching for close relatives, heavily endogamous matches 3C and more distant, to quote one of my Jewish friends, “dissolve into a quagmire of endogamy and are exceedingly difficult to unravel.”

In my own Acadian endogamous line, I often simply have to label them “Acadian” because the DNA tracks back to so many ancestors in different lines. In other words, I can’t tell which ancestor the match is actually pointing to because the same DNA segment or segments is/are carried by several ancestors and their descendants due to founder effect.

The difference with the Acadians is that we can actually identify many or most of them, at least at some point in time. As my cousin, Paul LeBlanc, once said, if you’re related to one Acadian, you’re related to all Acadians. Then he proceeded to tell me that he and I are related 137 different ways. My head hurts!

It’s no wonder that endogamy is incredibly difficult beyond the first few generations when it turns into something like multi-colored jello soup.

“Are Your Parents Related?” Tool

There’s another tool that you can utilize to determine if your parents are related to each other.

To determine if your parents are related to each other, you need to know about Runs of Homozygosity (ROH).

ROH means that the DNA on both strands or copies of the same chromosome is identical.

For a few locations in a row, ROH can easily happen just by chance, but the longer the segment, the less likely that commonality occurs simply by chance.

The good news is that you don’t need to know the identity of either of your parents. You don’t need either of your parents’ DNA tests – just your own. You’ll need to upload your DNA file to GEDmatch, which is free.

Click on “Are your parents related?”

GEDMatch analyzes your DNA to see if any of your DNA, above a reasonable matching threshold, is identical on both strands, indicating that you inherited the exact same DNA from both of your parents.

A legitimate match, meaning one that’s not by chance, will include many contiguous matching locations, generally a minimum of 500 SNPs or locations in a row. GEDmatch’s minimum threshold for identifying identical ancestral DNA (ROH) is 200 contiguous SNPs and 7 cM.

Here’s my result, including the graphic for the first two chromosomes. Notice the tiny green bars that show identical by chance tiny sliver segments.

I have no significant identical DNA, meaning my parents are not related to each other.

Next, let’s look at an endogamous example where there are small, completely identical segments across a person’s chromosome.

This person’s Acadian parents are related to each other, but distantly.

Next, let’s look at a Jewish person’s results.

You’ll notice larger green matching ROH, but not over 200 contiguous SNPs and 7 cM.

GEDMatch reports that this Jewish person’s parents are probably not related within recent generations, but it’s clear that they do share DNA in common.

People whose parents are distantly related have relatively small, scattered matching segments. However, if you’re seeing larger ROH segments that would be large enough to match in a genealogical setting, meaning multiple greater than 7 cM and 500 SNPs, you may be dealing with a different type of situation where cousins have married in recent generations. The larger the matching segments, generally, the closer in time.
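
To make the idea of a run of homozygosity concrete, here is a toy Python sketch that scans a list of genotype calls and reports stretches where both copies are identical. This is not GEDmatch’s algorithm, which is not described in this article. The function name is hypothetical, the minimum run length is tiny so the example fits on the page, and real tools also consider the length of the run in cM.

```python
def find_homozygous_runs(genotypes, min_snps):
    """Return (first_index, last_index) for each run of homozygous calls at least min_snps long."""
    runs = []
    run_start = None
    for i, call in enumerate(genotypes):
        # Homozygous means both letters are the same, for example "AA" or "GG"
        homozygous = len(call) == 2 and call[0] == call[1] and call[0] in "ACGT"
        if homozygous:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_snps:
                runs.append((run_start, i - 1))
            run_start = None
    # Close out a run that reaches the end of the list
    if run_start is not None and len(genotypes) - run_start >= min_snps:
        runs.append((run_start, len(genotypes) - 1))
    return runs


# Toy data: one run of five identical calls, then a run of only two
genotypes = ["AG", "AA", "CC", "GG", "TT", "AA", "CT", "GG", "GG"]

print(find_homozygous_runs(genotypes, min_snps=3))
```

Short runs like the final two calls happen by chance all the time, which is why the thresholds used in practice are in the hundreds of SNPs.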

Blogger Kitty Cooper wrote an article, here, about discovering that your parents are related at the first cousin level, and what their GEDMatch “Are Your Parents Related” results look like.

Let’s look for more clues.

Surnames

There MAY be an endogamy clue in the surnames of the people you match.

Viewing surnames is easier if you download your match list, which you can do at every vendor except Ancestry. I’m not referring to the segment data, but the information about your matches themselves.

I provided instructions in the recent article, How to Download Your DNA Match Lists and Segment Files, here.

If you suspect endogamy for any reason, look at your closest matches and see if there is a discernable trend in the surnames, or locations, or any commonality between your matches to each other.

For example, Jewish, Acadian, and Native surnames may be recognizable, as may locations.

You can evaluate in either or both of two ways:

  • The surnames of your closest matches. Closest matches listed first will be your default match order.
  • Your most frequently occurring surnames, minus extremely common names like Smith, Jones, etc., unless they are also in your closest matches. To utilize this type of matching, sort the spreadsheet in surname order and then scan or count the number of people with each surname.
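
If you would rather not count by hand, the second approach can be sketched in Python. This is a minimal example that assumes you have already copied the surname column out of your downloaded match list into a simple list. The function name and the short list of common surnames are hypothetical, and the sketch does not handle the exception for common names that also appear among your closest matches.

```python
from collections import Counter


def surname_counts(surnames, exclude=("smith", "jones")):
    """Count how often each surname appears, skipping extremely common names."""
    counts = Counter()
    for name in surnames:
        name = name.strip()
        if name and name.casefold() not in exclude:
            counts[name] += 1
    return counts


# Example surnames, as they might appear in a match list
matches = ["Bergeron", "Hebert", "Bergeron", "Smith", "Muise", "Jones", "Gaudet"]

# Most frequent surnames first
for surname, count in surname_counts(matches).most_common(3):
    print(surname, count)
```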

Here are some examples from our testers.

Jewish – Closest surname matches.

  • Roth
  • Weiss
  • Goldman
  • Schonwald
  • Levi
  • Cohen
  • Slavin
  • Goodman
  • Sender
  • Trebatch

Acadian – Closest surname matches.

  • Bergeron
  • Hebert
  • Bergeron
  • Marcum
  • Muise
  • Legere
  • Gaudet
  • Perry
  • Verlander
  • Trombley

Native American – Closest surname matches.

  • Ortega
  • Begay
  • Valentine
  • Hayes
  • Montoya
  • Sun Bear
  • Martin
  • Tsosie
  • Chiquito
  • Yazzie

You may recognize these categories of surnames immediately.

If not, Google is your friend. Eliminate common surnames, then Google for a few together at a time and see what emerges.

The most unusual surnames are likely your best bets.

Projects

Another way to get some idea of what groups people with these surnames might belong to is to enter the surname in the FamilyTreeDNA surname search.

Go to the main FamilyTreeDNA page, but DO NOT sign on.

Scroll down until you see this image.

Type the surname into the search box. You’ll see how many people have tested with that surname, along with projects where project administrators have included that surname indicating that the project may be of interest to at least some people with that surname.

Here’s a portion of the project list for Cohen, a traditional Jewish surname.

These results are for Muise, an Acadian surname.

Clicking through to relevant surname projects, and potentially contacting the volunteer project administrator can go a very long way in helping you gather and sift information. Clearly, they have an interest in this topic.

For example, here’s the Muise surname in the Acadian AmerIndian project. Two great hints here – Acadian heritage and Halifax, Nova Scotia.

Repeat for the balance of surnames on your list to look for commonalities, including locations on the public project pages.

Locations

Some of the vendor match files include location information. Each person on your match list will have the opportunity at the vendor where they tested to include location information in a variety of ways, either for their ancestors or themselves.

Where possible, it’s easiest to sort or scan the download file for this type of information.

Ancestry does not provide or facilitate a match list download, but you can still create your own for your closest 20 or 30 matches in a spreadsheet.

MyHeritage provides common surname and ancestral location information for every match. How cool is that!

Y DNA, Mitochondrial DNA, and Endogamy

Haplogroups for both Y and mitochondrial DNA can indicate and sometimes confirm endogamy. In other cases, the haplogroup won’t help, but the matches and their location information just might.

FamilyTreeDNA is the only vendor that provides Y DNA and mitochondrial DNA tests that include highly granular haplogroups along with matches and additional tools.

23andMe provides high-level haplogroups which may or may not be adequate to pinpoint a haplogroup that indicates endogamy.

Of course, only males carry Y DNA that tracks to the direct paternal (surname) line, but everyone carries their mother’s mitochondrial DNA that represents their mother’s mother’s mother’s, or direct matrilineal line.

Some haplogroups are known to be closely associated with particular ethnicities or populations, like Native Americans, Pacific Islanders, and some Jewish people.

Haplogroups reach back in time before genealogy and can give us a sense of community that’s not available by either looking in the mirror or through traditional records.

This Native American man is a member of high-level haplogroup Q-M242. However, some men who carry this haplogroup are not Native, but are of European or Middle Eastern origin.

I entered the haplogroup in the FamilyTreeDNA Discover tool, which I wrote about, here.

Checking the information about this haplogroup reveals that their common ancestor descended from an Asian man about 30,000 years ago.

The migration path in the Americas explains why this person would have an endogamous heritage.

Our tester would receive a much more refined haplogroup if he upgraded to the Big Y test at FamilyTreeDNA, which would remove all doubt.

However, even without additional testing, information about his matches at FamilyTreeDNA may be very illuminating.

The Q-M242 Native man’s Y DNA matches men with more granular haplogroups, shown above, at left. On the Haplogroup Origins report, you can see that these people have all selected the “US (Native American)” country option.

Another useful tool would be to check the public Y haplotree, here, and the public mitochondrial tree here, for self-reported ancestor location information for a specific haplogroup.

Here’s an example of mitochondrial haplogroup A2 and a few subclades on the public mitochondrial tree. You can see that the haplogroup is found in Mexico, the US (Native,) Canada, and many additional Caribbean, South, and Central American countries.

Of course, Y DNA and mitochondrial DNA (mtDNA) tell a laser-focused story of one specific line, each. The great news, if you’re seeking information about your mother or father, is that the Y is your father’s direct paternal (surname) line, and mitochondrial is your mother’s direct matrilineal line.

By combining Y and mitochondrial DNA results with ethnicity, autosomal matching, and the wide range of other tools that open doors, you will be able to reveal a great deal of information about whether you have endogamous heritage or not – and if so, from where.

I’ve provided a resource for stepping through and interpreting your Y DNA results, here, and mitochondrial DNA, here.

Discover for Y DNA Only

If you’re a female, you may feel left out of Y DNA testing and what it can tell you about your heritage. However, there’s a back door.

You can utilize the Y DNA haplogroups of your closest autosomal matches at both FamilyTreeDNA and 23andMe to reveal information about your heritage.

Haplogroup information is available in the download files for both vendors, in addition to the Family Finder table view, below, at FamilyTreeDNA, or on your individual matches’ profile cards at both 23andMe and FamilyTreeDNA.

You can enter any Y DNA haplogroup in the FamilyTreeDNA Discover tool, here.

You’ll be treated to:

  • Your Haplogroup Story – how many testers have this haplogroup (so far), where the haplogroup is from, and the haplogroup’s age. In this case, the haplogroup was born in the Netherlands about 250 years ago, give or take 200 years. I know that it was 1806 or earlier based on the common ancestor of the men who tested.
  • Country Frequency – heat map of where the haplogroup is found in the world.
  • Notable Connections – famous and infamous (this haplogroup’s closest notable person is Leo Tolstoy).
  • Migration Map – migration path out of Africa and through the rest of the world.
  • Ancient Connections – ancient burials. His closest ancient match is from about 1000 years ago in Ukraine. Their shared ancestor lived about 2000 years ago.
  • Suggested Projects – based on the surname, projects that other matches have joined, and haplogroups.
  • Scientific Details – age estimates, confidence intervals, graphs, and the mutations that define this haplogroup.

I wrote about the Discover tool in the article, FamilyTreeDNA DISCOVER Launches – Including Y DNA Haplogroup Ages.

Endogamy Tools Summary Tables

Endogamy is a tough nut sometimes, especially if you’re starting from scratch. In order to make this topic a bit easier and to create a reference tool for you, I’ve created three summary tables.

  • Various endogamy-related tools available at each vendor which will or may assist with evaluating endogamy
  • Tools and their ability to detect endogamy in different groups
  • Tools best suited to assist people seeking information about unknown parents or grandparents

Summary of Endogamy Tools by Vendor

Please note that GEDMatch is not a DNA testing vendor, but they accept uploads and do have some tools that the testing vendors do not.

 Tool 23andMe Ancestry FamilyTreeDNA MyHeritage GEDMatch
Ethnicity Yes Yes Yes Yes Use the vendors
Ethnicity Painting Yes + segments Yes, limited Yes + segments Yes
Ethnicity Phasing Yes Partial Yes No
DNA Communities No Yes No No
Genetic Groups No No No Yes
Family Matching aka Bucketing No No Yes No
Chromosome Browser Yes No Yes Yes Yes
AutoClusters Through Genetic Affairs No Through Genetic Affairs Yes, included Yes, with subscription
Match List Download Yes, restricted # of matches No Yes Yes Yes
Projects No No Yes No
Y DNA High-level haplogroup only No Yes, full haplogroup with Big Y, matching, tools, Discover No
Mitochondrial DNA High-level haplogroup only No Yes, full haplogroup with mtFull, matching, tools No
Public Y Tree No No Yes No
Public Mito Tree No No Yes No
Discover Y DNA – public No No Yes No
ROH No No No No Yes

Summary of Endogamous Populations Identified by Each Tool

The following chart provides a guideline for which tools are useful for the following types of endogamous groups. Bolded tools require that both parents be descended from the same endogamous group, but several other tools give more definitive results with higher amounts of endogamy.

Y and mitochondrial DNA testing are not affected by admixture, autosomal DNA or anything from the “other” parent.

Tool Jewish Acadian Anabaptist Native Other/General
Ethnicity Yes No No Yes Pacific Islander
Ethnicity Painting Yes No No Yes Pacific Islander
Ethnicity Phasing Yes, if different No No Yes, if different Pacific Islander, if different
DNA Communities Yes Possibly Possibly Yes Pacific Islander
Genetic Groups Yes Possibly Possibly Yes Pacific Islander
Family Matching aka Bucketing Yes Yes Possibly Yes Pacific Islander
Chromosome Browser Possibly Possibly Yes, once segments or ancestors identified Possibly Pacific Islander, possibly
Total Matches Yes, compared to non-endogamous No No No No, unknown
AutoClusters Yes Yes Uncertain, probably Yes Pacific Islander
Estimated Relationships High Not always Sometimes No Sometimes Uncertain, probably
Relationship Range High Possibly, sometimes Possibly Possibly Possibly Pacific Islander, possibly
More, Smaller Segments Yes Yes Probably Yes Pacific Islander, probably
Parents Related Some but minimal Possibly Uncertain Probably similar to Jewish Uncertain, Possibly
Surnames Probably Probably Probably Not Possibly Possibly
Locations Possibly Probably Probably Not Probably Probably Pacific Islander
Projects Probably Probably Possibly Possibly Probably Pacific Islander
Y DNA Yes, often Yes, often No Yes Pacific Islander
Mitochondrial DNA Yes, often Sometimes No Yes Pacific Islander
Y public tree Probably not alone No No Yes Pacific Islander
MtDNA public tree Probably not No No Yes Pacific Islander
Y DNA Discover Yes Possibly Probably not, maybe projects Yes Pacific Islander

Summary of Endogamy Tools to Assist People Seeking Unknown Parents and Grandparents

This table provides a summary of when each of the various tools can be useful to:

  • People seeking unknown close relatives
  • People who already know who their close relatives are, but are seeking additional information or clues about their genealogy

I considered rating these on a 1 to 10 scale, but the relative usefulness of these tools is dependent on many factors, so different tools will be more or less useful to different people.

For example, ethnicity is very useful if someone is admixed from different populations, or even 100% of a specific endogamous population. It’s less useful if the tester is 100% European, regardless of whether they are seeking close relatives or not. Conversely, even “vanilla” ethnicity can be used to rule out majority or recent admixture with many populations.

Tools Unknown Close Relative Seekers Known Close Relatives – Enhance Genealogy
Ethnicity Yes, to identify or rule out populations Yes
Ethnicity Painting Yes, possibly, depending on population Yes, possibly, depending on population
Ethnicity Phasing Yes, possibly, depending on population Yes, possibly, depending on population
DNA Communities Yes, possibly, depending on population Yes, possibly, depending on population
Genetic Groups Possibly, depending on population Possibly, depending on population
Family Matching aka Bucketing Not if parents are entirely unknown, but yes if one parent is known Yes
Chromosome Browser Unlikely Yes
AutoClusters Yes Yes, especially at MyHeritage if Jewish
Estimated Relationships High Not No
Relationship Range High Not reliably No
More, Smaller Segments Unlikely Unlikely other than confirmation
Match List Download Yes Yes
Surnames Yes Yes
Locations Yes Yes
Projects Yes Yes
Y DNA Yes, males only, direct paternal line, identifies surname lineage Yes, males only, direct paternal line, identifies and correctly places surname lineage
Mitochondrial DNA Yes, both sexes, direct matrilineal line only Yes, both sexes, direct matrilineal line only
Public Y Tree Yes for locations Yes for locations
Public Mito Tree Yes for locations Yes for locations
Discover Y DNA Yes, for heritage information Yes, for heritage information
Parents Related – ROH Possibly Less useful

Acknowledgments

A HUGE thank you to several people who contributed images and information in order to provide accurate and expanded information on the topic of endogamy. Many did not want to be mentioned by name, but you know who you are!!!

If you have information to add, please post in the comments.


How to Download Your DNA Match Lists & Segment Files

If you’ve taken an autosomal DNA test and you’re working to determine how your matches are related to you, meaning which ancestors you share, you’ll want to download your DNA match list.

There are three types of files that you can potentially download from each of the major autosomal DNA testing vendors.

Raw DNA file – If you want to upload your DNA file to another vendor for matching at their site (MyHeritage and FamilyTreeDNA,) you’ll need to download your raw data file from the vendor where you tested. I provided step-by-step instructions for this process at each of the vendors, here.

DNA Segment File – This file contains the segment information with each of your matches, including the start and end locations of your matching segment(s), the total number of matching (shared) centiMorgans (cM) above the vendor’s matching threshold, and sometimes the longest segment.

If you want to sort a spreadsheet to look for all of your matches on specific areas of chromosomes, this is the best way to achieve that goal. I use this information at DNAPainter when painting the segments of matches with whom I can identify a common ancestor.
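
As a rough illustration of that kind of sorting, here is a small Python sketch that pulls out every match whose segment overlaps a region of interest on one chromosome. The field names, match names, and positions are all hypothetical. Each vendor’s segment file uses its own column headings, so you would need to adapt the names to your own download.

```python
def overlapping_segments(segments, chromosome, start, end):
    """Return segments on the given chromosome that overlap the start-end region, sorted by start."""
    hits = [
        seg for seg in segments
        if seg["chromosome"] == chromosome
        and seg["start"] < end
        and seg["end"] > start
    ]
    return sorted(hits, key=lambda seg: seg["start"])


# Hypothetical rows, as they might look after reading a segment file
segments = [
    {"match": "Match A", "chromosome": 1, "start": 10_000_000, "end": 25_000_000, "cm": 14.2},
    {"match": "Match B", "chromosome": 1, "start": 20_000_000, "end": 32_000_000, "cm": 11.0},
    {"match": "Match C", "chromosome": 1, "start": 40_000_000, "end": 52_000_000, "cm": 9.5},
    {"match": "Match D", "chromosome": 2, "start": 12_000_000, "end": 30_000_000, "cm": 16.8},
]

region_hits = overlapping_segments(segments, 1, 18_000_000, 30_000_000)
print([seg["match"] for seg in region_hits])
```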

You may be able to download filtered lists or individual match data as well, as opposed to an entire match list spreadsheet, but the methodology varies at each vendor.

Ancestry does not provide segment information at all. 23andMe combines this information with the next file.

Match List – This file will contain your list of matches along with other information about the matches which you will find genealogically helpful. I find using this file easier than viewing each match separately at the vendors when trying to obtain an overview or when searching for a particular surname in either my match list or their ancestral surnames.

I can also sort by haplogroup, for example, which can sometimes help immensely if that information is available.

Ancestry does not facilitate or allow downloading your match list. 23andMe combines this information with your matching DNA segments in one file.

Here’s a handy-dandy summary by testing vendor.

Vendor Raw DNA File DNA Segment File Match List
23andMe Yes, instructions here Yes, instructions in this article Yes, instructions in this article
FamilyTreeDNA Yes, instructions here Yes, instructions in this article Yes, instructions in this article
MyHeritage Yes, instructions here Yes, instructions in this article Yes, instructions in this article
Ancestry Yes, instructions here No, does not provide No, does not provide

I’ve written step-by-step instructions for how to download your Match List and DNA Segment file(s) at each vendor.

23andMe

Please note that 23andMe is the only vendor to limit your matches, which means you will only receive a file containing:

  • 1500 matches if you tested before the V5 chip, so before August 9, 2017, and have not established communications with matches that would have rolled off of your list otherwise. (I have 1805 matches, so have established contact with 305 that would otherwise have rolled off the end.)
  • 1500 if you tested on the V5 chip, so beginning August 9, 2017, but did not establish communications OR did not purchase the health option, OR did not purchase the yearly membership. If you established communications, those matches won’t roll off, and if you purchase the membership, the match threshold is raised. You may still need to establish contact to keep people from rolling off the larger list as well.
  • 5000-ish (23andMe doesn’t say exactly) if you tested on the V5 chip for BOTH ancestry and health AND purchased the yearly membership.

You will only receive match information for people who are listed on your restricted match list, not people who have rolled off as closer matches arrived. Therefore, I encourage you to retain your old match lists because some of your matches will be gone each time you download.
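If you do keep your old downloads, comparing two of them shows who has rolled off and who is new. Here is a minimal sketch, assuming you have already pulled the match names out of each file into a list. The names below are invented, and because display names are not guaranteed to be unique, treat the result as a starting point rather than a definitive answer.

```python
def rolled_off(old_names, new_names):
    """Matches present in the older download but missing from the newer one."""
    return sorted(set(old_names) - set(new_names))

def newly_arrived(old_names, new_names):
    """Matches present in the newer download but not in the older one."""
    return sorted(set(new_names) - set(old_names))

# Invented example names, standing in for the name column of two downloads.
old_download = ["Ann Smith", "Bob Jones", "Cara Lee"]
new_download = ["Ann Smith", "Cara Lee", "Dev Patel"]

print("Rolled off:", rolled_off(old_download, new_download))
print("New:", newly_arrived(old_download, new_download))
```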

23andMe combines your match list with your segment file.

Sign on and select DNA Relatives on the toolbar.

Next, select “See all relatives.”

Scroll to the very bottom and click on Request DNA Relatives Data Download.

Your file will be prepared, and you’ll receive an email when the file is ready to be downloaded. Mine only took a minute or two, and I simply waited on my 23andMe page until the message appeared.

Save and open the downloaded file, and you’ll see a variety of information about each of your matches, in closest-match-first order, including:

  • Match name
  • Chromosome segment match information, including start and end locations, genetic distance in centiMorgans (cM), and SNPs
  • Maternal and paternal sides if your parent or parents have tested
  • Number of matching segments
  • Relationship information
  • Birth year
  • Percent shared DNA
  • Haplogroups
  • Notes you’ve made
  • Family surnames
  • Family locations
  • 4 Grandparents’ birth country
  • Family Tree URL, external to 23andMe, if provided by tester

FamilyTreeDNA

At FamilyTreeDNA, your match list and segment information are contained in two separate files.

Sign on and click on Family Finder Matches under Autosomal DNA Results and Tools.

You’ll see your matches. At the top of your match list, on the right side, click on “Export CSV.”

You can select “All Matches” or “Filtered Matches.”

If you haven’t selected a filter, you won’t be able to make that selection. Generally, you want the entire match list.

Your match list will be prepared and downloaded.

You’ll find:

  • Match name
  • Relationship information
  • Shared DNA total
  • Longest segment
  • Linked relationship if you have linked that person to their profile card in your tree
  • Ancestral surnames
  • Haplogroups if tested
  • Notes you’ve made
  • Bucketing – Paternal, maternal, both, none
  • X-Match amount

Note – If you’re a male, valid X matches (meaning matches that are not identical by chance) will always be on your maternal side because you received your Y chromosome from your father instead of a copy of his X. I wrote about X matching, here.

If your match is a male, an X match will always be through his mother’s line.

Segment information is available in a separate download on the chromosome browser page.

Under Autosomal DNA Results and Tools, click on the Chromosome Browser.

You’ll be able to select people to compare in the chromosome browser, but to download all of your matching segments to all of your matches, click on “Download All Segments.”

If you select people to compare your relationship, and then click on “Download Segments,” you’ll only be downloading the segments for the people you are comparing.

To download all of your segments, be sure the “All” is showing in the link and download before selecting anyone for comparison.

MyHeritage

MyHeritage also provides two separate files for matches and chromosome segment information.

Select DNA matches, then the 3-dot menu, then “Export DNA Matches.”

If you also want the individual segment information for your matches, order the second file on that menu as well, “Export shared DNA segment info for shared DNA matches.”

You’ll see a message that your report is being prepared and will be sent to the email address on file.

If your file doesn’t appear in your email box, check your spam folder.

Your match list provides:

  • Match name
  • Age
  • Country
  • Contact link
  • DNA managed by (if not the tester)
  • Contact link for DNA manager
  • Relationship information
  • Total cM
  • Percent of matching DNA
  • Number of matching segments
  • Largest segment
  • Has tree and tree manager
  • Number of people in their tree
  • Tree link and link to contact tree manager
  • Number of SmartMatches
  • Shared ancestral surnames
  • All ancestral surnames
  • Notes you’ve made
  • Has Theory of Family Relativity

Now that you have these files, what do you do with them?

Evaluating

Is there anything that stands out as remarkable, perhaps that you didn’t know or notice before? Patterns that might be informative?

I had a huge brick wall on my mother’s side that has since fallen, but retrospectively, had I reviewed these lists when that wall was still standing firm, there was a huge hint just waiting for me.

My mother has a very unexpected Acadian line through her great-grandfather, Anthony Lore, so 12.5% of her heritage.

On my match list, I see a large number of French surnames, but I didn’t know of any French ancestors on either side of my tree. Many surnames repeat, such as LeBlanc, d’Entremont (which is really unusual), Landry, and deForest. Why were these people on my match list? This is definitely smoke, and there must be fire someplace, but where?
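Repeating surnames are easier to spot when you count them. The sketch below tallies how many matches list each ancestral surname. It assumes the surnames for each match arrive as one comma-separated string in a field I have called `surnames`, which is a hypothetical name. The delimiter and column name differ by vendor, and the sample matches are invented.

```python
from collections import Counter

def surname_counts(matches, delimiter=","):
    """Count how many matches list each surname (case-insensitive)."""
    counts = Counter()
    for match in matches:
        # A set counts each surname once per match, ignoring case and stray spaces.
        surnames = {
            s.strip().lower()
            for s in match["surnames"].split(delimiter)
            if s.strip()
        }
        counts.update(surnames)
    return counts

# Invented sample matches.
matches = [
    {"name": "Match 1", "surnames": "LeBlanc, Landry, Smith"},
    {"name": "Match 2", "surnames": "Landry, d'Entremont"},
    {"name": "Match 3", "surnames": "leblanc, Landry"},
]

counts = surname_counts(matches)
for surname, total in counts.most_common(2):
    print(surname, total)
```

Surnames that appear on many matches, but nowhere in your known tree, are the smoke worth following.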

Looking at the locations associated with these matches’ ancestors would have provided additional clues.

However, simply googling my great-grandfather’s surname in combination with those French surnames I listed above produced these 3 top search results.

Yes, you guessed it. Anthony turned out to be “Antoine” and Lore is spelled in a variety of ways, including Lord. His family is Acadian.

That’s Anthony Lore, which is how he was listed on the death certificate of his son, in the software on my computer, above, and here is Antoine Lore at WikiTree, below.

As you can see, that brick wall falling opened a whole new group of ancestors, and along with it, my appreciation of endogamy😊

Match lists facilitate viewing the big picture and can be a very useful tool for people seeking unknowns or trying to group people together in a variety of ways.

Do you have any brick walls that need to fall?

How can or do you utilize your match lists?

_____________________________________________________________

Follow DNAexplain on Facebook, here or follow me on Twitter, here.

Share the Love!

You’re always welcome to forward articles or links to friends and share on social media.

If you haven’t already subscribed (it’s free,) you can receive an email whenever I publish by clicking the “follow” button on the main blog page, here.

You Can Help Keep This Blog Free

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Uploads

Genealogy Products and Services

My Book

Genealogy Books

Genealogy Research

In Search of…Vendor Features, Strengths, and Testing Strategies

This is the third in our series of articles about searching for unknown close family members, specifically parents, grandparents, or siblings. However, these same techniques can be applied to ancestors further back in time too.

In this article, we are going to discuss your goals and why testing or uploading to multiple vendors is advantageous – even if you could potentially solve the initial mystery at one vendor. Of course, the vendor you test with first might not be the vendor where the mystery will be solved, and data from multiple vendors might just be the combination you need.

Testing Strategy – You Might Get Lucky

I recommended in the first article that you go ahead and test at the different vendors.

Some people asked why, and specifically, why you wouldn’t just test at one vendor with the largest database first, then proceed to the others if you needed to.

That’s a great question, and I want to discuss the pros and cons in this article more specifically.

Clearly, that is one strategy, but the approach you select might differ based on a variety of considerations:

  • You may only be interested in obtaining the name of the person you are seeking – or – you may be interested in finding out as much as possible.
  • You may find that your best match at one company is decidedly unhelpful, and may even block you or your efforts, while someone elsewhere may be exactly the opposite.
  • Solving your mystery may be difficult and painful at one vendor, but the answer may be infinitely easier at a different vendor where the answer may literally be waiting.
  • There may not be enough, or the right information, or matches, at any one vendor, but the puzzle may be solvable by combining information from multiple vendors and tests. Every little bit helps.
  • You may have a sense of urgency, especially if you hope to meet the person and you’re searching for parents, siblings or grandparents who may be aging.
  • You may be cost-sensitive and cannot afford more than one test at a time. Fortunately, our upload strategy helps with that too. Also, watch for vendor sales or bundles.

From the time you order your DNA test, it will be about 6-8 weeks, give or take a week or two in either direction, before you receive results.

When those results arrive, you might get lucky, and the answer you seek is immediately evident with no additional work and just waiting for you at the first testing company.

If that’s the case, you got lucky and hit the jackpot. If you’re searching for both parents, that means you still have one parent to go.

Unidentified grandparents can be a little more difficult, because there are four of them to sort between.

If you discover a sibling or half-sibling, you still need to figure out who your common parent is. Sometimes X, Y, and mitochondrial DNA provide an immediate answer and are invaluable in these situations.

It’s more likely that you’ll find a group of somewhat more distant relatives. You may be able to figure out who your common grandparents or great-grandparents are, but not your parent(s) initially. Often, the closer generation or two is actually the most difficult because you’re dealing with contemporary records which are not publicly available, fewer descendants, and the topic may be very uncomfortable for some people. It’s also complicated because you’re often not dealing with “full” relationships, but “half,” as in half-sibling, half-niece, half-1C, etc.

You may spend a substantial amount of time trying to solve this puzzle at the first vendor before ordering your next test.

That second test will also take about 6-8 weeks, give or take. I recommend that you order the first two autosomal tests, now.

Order Your First Two Autosomal Tests

The two testing companies with the largest autosomal databases for comparison, Ancestry and 23andMe, DO NOT accept DNA file uploads from other companies, so you’ll need to test with each individually.

Fortunately, you CAN transfer your autosomal DNA tests to both MyHeritage and FamilyTreeDNA, for free.

You will have different matches at each company. Some people will be far more responsive and helpful than others.

I recommend that you go ahead and order both the Ancestry and 23andMe tests initially, then upload the first one that comes back with results to both FamilyTreeDNA and MyHeritage. Complete, step-by-step download/upload instructions can be found here.

You can also upload your DNA file to a fifth company, Living DNA, but they are significantly smaller and heavily focused on England and Great Britain. However, if that’s where you’re searching, this might be where you find important matches.

You can also upload to GEDMatch, a popular third-party database, but since you’re going to be in the databases of the four major testing companies, there is little to be gained at GEDMatch in terms of people who have not tested at one of the major companies. Do NOT upload to GEDMatch INSTEAD of testing or uploading to the four major sites, as GEDMatch only has a small fraction of the testers in each of the vendor databases.

What GEDMatch does offer is a chromosome browser – something that Ancestry does NOT offer, along with other clustering tools which you may find useful. I recommend GEDMatch in addition to the others, if needed or desired.

Ordering Y and Mitochondrial DNA Tests

We reviewed the basics of the different kinds of DNA, here.

Some people have asked why, if autosomal DNA shows relatives on all of your lines, one would want to order specific tests that focus on just one line.

It just so happens that the two lines that Y and mitochondrial DNA test ARE the two lines you’re seeking: direct maternal, your mother (and her mother), and direct paternal, your father (and his father).

These two tests are different kinds of DNA tests, testing a different type of DNA, and provide very focused information, and matches, not available from autosomal DNA tests.

For men, Y DNA can reveal your father’s surname, which can be an invaluable clue in narrowing paternal candidates. Knowing that my brother’s Y DNA matched several men with the surname of Priest made me jump for joy when he matched a woman of that same last name at another vendor.

Here’s a quote from one of the members of a Y DNA project where I’m the volunteer administrator:

“Thank you for your help understanding and using all 4 kinds of my DNA results. By piecing the parts together, I identified my father. Specifically, without Y DNA testing, and the Big Y test, I would not have figured out my parental connection, and then that my paternal line had been assigned to the wrong family. STR testing gave me the correct surname, but the Big Y test showed me exactly where I fit, and disproved that other line. I’m now in touch with my father, and we both know who our relatives are – two things that would have never happened otherwise.”

If you fall into the category of, “I want to know everything I can now,” then order both Y and mitochondrial DNA tests initially, along with those two autosomal tests.

You will need to order Y (males only) and mitochondrial DNA tests separately from the autosomal Family Finder test, although you should order on the same account as your Family Finder test at FamilyTreeDNA.

If you take the Family Finder autosomal test at FamilyTreeDNA or upload your autosomal results from another vendor, you can simply select to add the Y and mitochondrial DNA tests to your account, and they will send you a swab kit.

Conversely, you can order either a Y or mitochondrial DNA test, and then add a Family Finder or upload a DNA file if you’ve already taken an autosomal DNA test to that account too. Note – these might not be current prices – check here for sales.

You will want all 3 of your tests on the same account so that you can use the Advanced Matches feature.

Using Advanced Matches, you’ll be able to view people who match you on combinations of multiple kinds of tests.

For example, if you’re a male, you can see if your Y DNA matches also match you on the Family Finder autosomal test, and if so, how closely?

Here’s an example.

In this case, I requested matches to men with 111 markers who also match the tester on the Family Finder test. I discovered both a father and a full sibling, plus a few more distant matches. There were ten total combined matches to work with, but I’ve only shown five for illustration purposes.

This information is worth its weight in gold.

Is the Big Y Test Worth It?

People ask if the Big Y test is really worth the extra money.

The answer is, “it depends.”

If all you’re looking for are matching surnames, then the answer is probably no. A 37 or 111 marker test will probably suffice. Eventually, you’ll probably want to do the Big Y, though.

If you’re looking for exact placement on the tree, with an estimated distance to other men who have taken that test, then the answer is, “absolutely.” I wish the Big Y test had been available back when I was hunting for my brother’s biological family.

The Big Y test provides a VERY specific haplogroup and places you very accurately on the Y DNA tree, along with other men of your line, assuming they have tested. You may find the surname, and you may also be placed within a generation, or a few generations, of the present in that family line.

Additionally, the Discover page provides estimates of how far in the past you share a common ancestor with other people that share the same haplogroup. This can be a HUGE boon to a male trying to figure out his surname line and how closely in time he’s related to his matches.

Big Y NPE Examples

The Y DNA SNPs tested with the Big Y test accrue a mutation about every generation or so. Sometimes we see mutations in every generation.

Here’s an example from my Campbell line. Haplogroups are listed in the top three rows.

I created this spreadsheet, but FamilyTreeDNA provides a block tree for Big Y testers. I’ve added the genealogy of the testers, with the various Big Y testers at the bottom and common ancestors above, in bold.

We have two red NPE lines showing. The MacFarlane tester matches M. Campbell VERY closely, and two Clark males match W. Campbell and other Campbells quite closely. We utilized autosomal plus the Y results to determine where the unknown parentage events occurred. Today, if you’re a Clark or MacFarlane male, or a male by any other surname who was fathered by a Y chromosome Campbell male (by any surname), you’ll know exactly where you fit in this group of testers on your direct paternal line.

Y DNA is important because men often match other men with the same surname, which is a HUGE clue, especially in combination with autosomal DNA results. I say “often,” because it’s possible that no one in your line has tested, or that your father’s surname is not his biological surname either.

Y and mitochondrial DNA matches can be HUGELY beneficial pieces of information either by confirming a close autosomal relationship on that line, or eliminating the possibility.

Lineage-Specific Population Information

In addition to matching other people, both Y and mitochondrial DNA tests provide you with lineage-specific population or “ethnicity” information for this specific line which helps you focus your research.

For example, if you view the Y DNA Haplogroup Origins shown for this tester, you’ll discover that these matches are Jewish.

The tester might not be Jewish on any other genealogical line, but they definitely have Jewish ancestry on their Y DNA, paternal, line.

The same holds true for mitochondrial DNA as well. The main difference with mitochondrial DNA is that the surname changes with each generation, haplogroups today (pre-Million Mito) are less specific, and fewer people have been tested.

Y and Mitochondrial DNA Benefits

Knowing your Y and mitochondrial DNA haplogroups not only arms you with information about yourself, it also provides you with matching tools and an avenue to include or exclude people as your direct line paternal or maternal ancestors.

Your Y and mitochondrial DNA can also provide CRITICALLY IMPORTANT information about whether that direct line ancestor belonged to an endogamous population, and where they came from.

For example, both Jewish and Native populations are endogamous populations, meaning highly intermarried for many generations into the past.

Knowing that helps you adjust your autosomal relationship analysis.

Why Order Multiple Tests Initially Instead of Waiting?

If you’ve been adding up the elapsed time, two autosomal tests (Ancestry and 23andMe), two uploads (to FamilyTreeDNA and MyHeritage), a Y DNA test, and a mitochondrial DNA test, if purchased serially, one after the other, mean you’ll be waiting approximately 6-8 months.
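The arithmetic behind that estimate can be sketched as follows. It assumes four lab tests (two autosomal, Y DNA, and mitochondrial) at 6-8 weeks each, treats the two uploads as adding no meaningful wait, and rounds a month to four weeks. All three are simplifying assumptions of mine, not vendor figures.

```python
WEEKS_PER_MONTH = 4  # rough approximation, for illustration only

def serial_wait_weeks(test_count, weeks_low=6, weeks_high=8):
    """Total wait when each test is ordered only after the previous one returns."""
    return test_count * weeks_low, test_count * weeks_high

def parallel_wait_weeks(weeks_low=6, weeks_high=8):
    """Total wait when every test is ordered at the same time."""
    return weeks_low, weeks_high

tests = ["Ancestry", "23andMe", "Y DNA", "mitochondrial DNA"]

serial_low, serial_high = serial_wait_weeks(len(tests))
print(f"Serial: {serial_low}-{serial_high} weeks, "
      f"about {serial_low // WEEKS_PER_MONTH}-{serial_high // WEEKS_PER_MONTH} months")

parallel_low, parallel_high = parallel_wait_weeks()
print(f"All at once: {parallel_low}-{parallel_high} weeks")
```

Ordering serially takes four times as long as ordering everything at once.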

Do you want to wait 6-8 months for all of your results? Can you afford to?

Part of this answer has to do with what, exactly, you’re seeking, and how patient you are.

Only you can answer that question.

A Name or Information?

Are you seeking the name or identity of a person, or are you seeking information about that person?

Most people don’t just want to put a name to the person they are seeking – they want to learn about them and the rest of the family that door opens.

You will have different matches at each company. Even after you identify the person you seek, the people you match may have trees you can view, with family photos and other important information. (Remember, you can’t see living people in trees.) Your matches may have first-person information about your relative and may know them if they are living, or have known them.

Furthermore, you may have the opportunity to meet that person. Time lost to delay cannot be recovered.

One cousin that I assisted discovered that his father had died just six weeks before he broke through that wall and made the connection.

Working with data from all vendors simultaneously will allow you to combine that data and utilize it together. Using your “best” matches at each company, augmented by X, Y, and/or mitochondrial DNA, can make MUCH shorter work of this search.

Your closest autosomal matches are the most important and insightful. In this series, I will be working with the top 15 autosomal results at each vendor, at least initially. This approach provides me with the best chance of meaningful close relationship discoveries.

Data and Vendor Results Integration

Here’s a table of my two closest maternal and paternal matches at the four major vendors. I can assign these to maternal or paternal sides, because I know the identity of my parents, and I know some of these people. If an adoptee was doing this, the top 4 could all be from one parent, which is why we work with the top 15 or so matches.

Vendor | Closest Maternal | Closest Paternal | Comments
Ancestry | 1C, 1C1R | Half-1C, 2C | I recognized both of the maternal and neither of the paternal.
23andMe | 2C, 2C | 1C1R, half-gr-niece | Recognized both maternal, one paternal
MyHeritage | Mother uploaded, 1C | Half-niece, half-1C | Recognized both maternal, one paternal
FamilyTreeDNA | Mother tested, 1C1R | Parent/child, half-gr-niece uploaded | Recognized all 4

To be clear, I tested my mother’s mitochondrial DNA before she passed away, but because FamilyTreeDNA archives DNA samples for 25 years, as the owner/manager of her DNA kit, I was able to order the Family Finder test after she had passed away. Her tests are invaluable today.

Then, years later, I uploaded her results to MyHeritage.

If I was an adopted child searching for my mother, I would find her results in both databases today. She’ll never be at either 23andMe or Ancestry because she passed away before she could test there and they don’t accept uploads.

Looking at the other vendors, my half-niece at MyHeritage is my paternal half-sibling’s daughter. My half-sibling is deceased, so this is as close as I’ll ever get to matching her.

At 23andMe, the half-great-niece is my half-sibling’s grandchild.

It’s interesting that I have no matches to descendants of my other half-sibling, who is also deceased. Maybe I should ask if any of his children or grandchildren have tested. Hmmmm…..

You can see that I stand a MUCH BETTER chance of figuring out close relatives using the combined closest matches of all four databases instead of the top matches from just one database. It doesn’t matter if the database is large if the right person or people didn’t test there.
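Once you have a match list from each vendor, pooling the closest matches into one ranked list is straightforward. The sketch below is one way to do it, assuming each match has already been reduced to a name and a total shared cM value. The field names `name` and `total_cm` are hypothetical and the cM figures are invented. Vendors calculate shared cM somewhat differently, so treat the combined ranking as approximate.

```python
def top_matches(matches_by_vendor, per_vendor=15):
    """Take the closest matches at each vendor and rank them together by shared cM."""
    combined = []
    for vendor, matches in matches_by_vendor.items():
        ranked = sorted(matches, key=lambda m: m["total_cm"], reverse=True)
        for match in ranked[:per_vendor]:
            # Tag each match with the vendor it came from.
            combined.append({"vendor": vendor, **match})
    return sorted(combined, key=lambda m: m["total_cm"], reverse=True)

# Invented sample data.
matches_by_vendor = {
    "Ancestry": [
        {"name": "Match A", "total_cm": 850.0},
        {"name": "Match B", "total_cm": 410.0},
    ],
    "23andMe": [
        {"name": "Match C", "total_cm": 230.0},
        {"name": "Match D", "total_cm": 905.0},
    ],
    "MyHeritage": [
        {"name": "Match E", "total_cm": 1700.0},
    ],
}

for match in top_matches(matches_by_vendor):
    print(f'{match["total_cm"]:>7.1f} cM  {match["vendor"]:<12} {match["name"]}')
```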

Combine Resources

I’ll be providing analysis methodologies for working with results from all of the vendors together, just in case your answer is not immediately obvious. Taking multiple DNA tests facilitates using all of these tools immediately, not months later. Solving the puzzle sooner means you may not miss valuable opportunities.

You may also discover that the door slams shut with some people, or they may not respond to your queries, but another match may be unbelievably helpful. Don’t limit your possibilities.

Let’s take a look at the strengths of each vendor.

Vendor Strengths and Things to Know

Every vendor has product strengths and idiosyncrasies that the others do not. All vendors provide matches and shared matches. Each vendor provides ethnicity tools which certainly can be useful, but the features differ and will be covered elsewhere.

  • Ancestry – Ancestry has the largest autosomal database and includes ThruLines, but no Y or mitochondrial DNA testing, no clusters, no chromosome browser, no triangulation, and no X chromosome matching or reporting. Ancestry provides genealogical records, advanced tools, and full tree access to your matches’ trees with an Ancestry subscription. Ancestry does not allow downloading your match list or segment match information, but the other vendors do.
  • 23andMe – 23andMe has the second largest database. They provide triangulation and genetic trees that include your closest matches. Many people test at 23andMe for health and wellness information, so 23andMe has people in their database who are not specifically interested in genealogy and probably won’t have tested elsewhere, but may be invaluable to your search. 23andMe provides Y and mtDNA high-level haplogroups only, but no matching or other haplogroup information. If you purchase a new test or have a current V5 ancestry+health test, you can expand your matches from a limit of 1500 to about 5000 with an annual membership. For seeking close relatives, you don’t need those features, but you may want them for genealogy. 23andMe is the only vendor that limits their customers’ matches.
  • MyHeritage – MyHeritage has the third largest database that includes lots of European testers. MyHeritage provides triangulation, Theories of Family Relativity, and an integrated cluster tool* but does not report X matches and does not offer Y or mitochondrial DNA testing. MyHeritage accepts autosomal DNA file uploads from other testing companies for free and provides access to advanced DNA features for a one-time unlock fee. MyHeritage includes genealogical records and full feature access to advanced DNA tools with a Complete Subscription. (Free 15-day trial subscription, here.)
  • FamilyTreeDNA Family Finder (autosomal) – FamilyTreeDNA is the oldest DNA testing company, meaning their database includes people who initially tested 20+ years ago and have since passed away. This, in essence, gets you one generation further back in time, with the possibility of stronger matches. Their Family Matching feature buckets and triangulates your matches, assigning them to your maternal or paternal sides if you link known matches to their proper place in your tree, even if your parents have not tested. FamilyTreeDNA accepts uploads from other testing companies for free and provides advanced DNA features for a one-time unlock fee.
  • FamilyTreeDNA – FamilyTreeDNA is the only company that offers both Y and mitochondrial DNA testing products that include matching, integration with autosomal test results, and other tools. These two tests are lineage-specific and don’t have to be sorted from your other ancestral lines.

I wrote about using Y DNA results, here.

I wrote about using mitochondrial DNA results, here.

*Third parties such as Genetic Affairs provide clustering tools for both 23andMe and FamilyTreeDNA. Clustering is integrated at MyHeritage. Ancestry does not provide a tool for nor allow third-party clustering. If the answer you seek isn’t immediately evident, Genetic Affairs clustering tools group people together who are related to each other, and you, and create both genetic and genealogical trees based on shared matches. You can read more about their tools, here.

Fish in all the Ponds and Use All the Bait Possible

Here’s the testing and upload strategy I recommend, based on the above discussion and considerations. The bottom line is this – if you want as much information as possible, as quickly as possible, order the four tests in red initially. Then transfer the first autosomal test results you receive to the two companies identified in blue. Optionally, GEDMatch may have tools you want to work with, but they aren’t a testing company.

What | When | Ancestry | 23andMe | MyHeritage | FamilyTreeDNA
Order autosomal | Initially | X | X | |
Order Y 111 or Big-Y DNA test if male | Initially | | | | X
Order mitochondrial DNA test | Initially if desired | | | | X
Upload free autosomal | When Ancestry or 23andMe results are available | | | X | X
Unlock Advanced Tools | When you upload | | | $29 | $19
Optional GEDMatch free upload | If desired, can subscribe for advanced tools | | | |

When you upload an autosomal DNA file to a vendor site, only upload one file per site, per tester. Otherwise, multiple tests simply glom up everyone’s match list with multiple matches to the same person.

Multiple vendor sites will hopefully provide multiple close matches, which increase your opportunity to discover INFORMATION about your family, not just the identity of the person you seek.

Or maybe you prefer to wait and order these DNA tests serially, waiting until one set of results is back and you’re finished working with them before ordering the next one. If so, that means you’re a MUCH more patient person than me. 😊

Our next article in this series will be about endogamy, how to know if it applies to you, and what that means to your search.


DNA: In Search of…What Do You Mean I’m Not Related to My Family? – and What Comes Next?

Welcome to the second in our series of articles about how to search for unknown family members.

I introduced the series in the article, DNA: In Search of…New Series Launches.

This article addresses the question of “How did this happen?” and introduces the tools we need to answer that question. I’ve combined two articles into one because I really didn’t want to leave you hanging after introducing you to the problem.

We discuss the various kinds of DNA tests, when they are appropriate for your biological sex, and how one can use them to discover information about the person or people you’re seeking.

In other words, we begin at the point of making the discovery that there is something amiss, then review possible glitches. Once we confirm there is someone you need to search for, we discuss how to use genetic testing reasonably and in a planned fashion to solve that mystery.

Please note that I am NOT referring to unexpected ethnicity results in this article. This article refers to your match list and who you do and don’t match on that list. We will discuss ethnicity and how it can help you in a different context in a future article.

The Unknown

Some people have known all their lives that they were adopted, or that they didn’t know the identity of one parent, generally their father.

Other people have made or will make that discovery in a different way. Sometimes, that realization happens when they take an autosomal DNA test and don’t match people they expect to match, either not at all or in a different way.

For example:

  • You might not match a parent or a sibling.
  • You could match only people on your mother’s side, but no known relatives on your father’s side.
  • Your parents or siblings have tested, but you don’t match any of them.
  • Your immediate family hasn’t tested, but your first and second cousins have tested, and you don’t match any of them.
  • You recognize no people, families, or family names on your match list.
  • You think you know your genealogy, but nothing on your match list looks familiar.
  • If your parents and close relatives haven’t tested, not recognizing families might be explained if your family is part of a community of undertested individuals.
  • You might not recognize anyone or surnames if you know absolutely nothing about your family genealogy.
  • Sometimes, a sibling is reported as a half-sibling instead of a full sibling, which is an unexpected finding. This means that you only share one parent, not two. I wrote about this in the article Full or Half Siblings. The non-matching parent is generally the father. The question that follows is, which one of you, if not both, wasn’t fathered by the man you thought was your biological father?

These discoveries are generally unexpected and unwelcome – a horrible shock followed by some level of disbelief.

I’ve been there.

My half-brother turned out to not be my half-brother, so we weren’t biologically related at all, although that didn’t change how much I loved him one iota.

Later, I did identify his father, but it was too late for them. My brother had passed on by that time.

Ironically, his biological family would have welcomed him with open arms.

If you’re interested, I wrote about our journey in a series of articles:

The Shock of Discovery

It’s difficult discovering that your full sibling isn’t a full sibling or not a sibling at all, but it’s even worse when you discover that one or both of your parents are not your biological parent(s) when you weren’t expecting that. Obviously, sometimes those two shockers accompany each other.

And no, if you don’t match your parents, siblings, first or second cousins, DNA tests can’t be “that” wrong in terms of matching. That’s generally the first question everyone asks.

Yes, we have seen a couple of instances of test mix-ups at the labs, many years ago, among the millions of tests taken. Better quality control procedures were introduced, and a mix-up hasn’t happened in a very long time. However, if you really think that’s a possibility, or you need peace of mind – order another test from the same vendor. If the second test comes back with the same match list as the first test, there is no lab mix-up.

Or, you can order a test from another vendor – something you’re going to need anyway to solve the mystery and for your genealogy. Hint – the two vendors you must test at directly are Ancestry and 23andMe because they don’t accept uploads. If you’re going to order another test, make it one or both of those.

Before deciding you’ve discovered a genetic disconnect, let’s take a deep breath and look at a couple of other possibilities first.

Be Sure the Vial or Transfer Wasn’t Confused

If you’re encountering a situation where you’re not matching relatives that you know have tested, or for some reason, you suspect something isn’t right, the first things that need to be considered are:

  • Are you positive that your relative(s) have taken a DNA test? You wouldn’t believe how many times someone has told me that they don’t match their mother/father/sibling and come to find out, their family member hasn’t tested. Did they order a test but never send it in? Did they send it in, but their results aren’t back yet?
  • Are you positive that your relative(s) tested at the same company where you did? Many times we discover that they’ve tested, but at a different company. Have your relative show you their results, take a screenshot, or give you their login to confirm you’re at the same vendor.
  • Are you missing all of your relatives or just one or two in the same line? If the answer is one or two, they, not you, may have a disconnect, especially if you match other people on the same side of your family.
  • Did you and a friend or spouse both swab or spit at the same time? If so, is there any possibility that your and their vials were inadvertently swapped when you put them in envelopes and mailed them?

If there is any doubt, check with that other person and see if they are experiencing the same issue. If you look at their results, you may recognize your own family. I’ve seen this occur at family reunions and at the holidays, where several DNA tests were taken by various family members.

  • This last situation is much more common and is caused by confusing files during a download/upload to another vendor. Do you manage multiple kits, and did you inadvertently download the wrong DNA file, or upload the wrong person’s DNA file to a different vendor?

If so, you’re looking at someone else’s results, thinking they are your own. If that person is a cousin, you may be even more confused because you may match some of the same people, just at very different levels. This could make your sibling look like a half-sibling or first cousin, for example.

If there is any possibility of an upload mix-up, or any doubt whatsoever:

  1. Delete the suspect file at the vendor where you uploaded the DNA file
  2. Delete the downloaded files from your computer
  3. Start over by downloading the DNA file again from the original vendor
  4. Label the downloaded file clearly, and immediately, with the tester’s name and date.
  5. Upload the new file to the target vendor before you download another person’s DNA file.
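If you manage several kits, step 4 is the one most worth automating. Here is a minimal sketch of one way to build a clearly labeled file name. The function name `labeled_name`, the sample file name, and the tester's name are all hypothetical, and no vendor requires this naming pattern.

```python
from datetime import date
from pathlib import Path


def labeled_name(original: str, tester: str, downloaded: date) -> str:
    """Build a file name that carries the tester's name and the download date."""
    path = Path(original)
    # Keep only letters and digits from each part of the name so the
    # result is safe to use as a file name on any operating system.
    safe_tester = "_".join(
        "".join(ch for ch in part if ch.isalnum()) for part in tester.split()
    )
    return f"{safe_tester}_{downloaded.isoformat()}_{path.name}"


# Hypothetical example: a raw data file downloaded for "Jane Q. Doe".
print(labeled_name("dna-data.zip", "Jane Q. Doe", date(2022, 7, 11)))
```

Renaming the file the moment it lands in your downloads folder, before you touch the next kit, is what keeps two testers' files from being confused.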

Step-by-step upload/download instructions can be found, here.

Not Parent Expected

If you discover that one of two parents is not the expected biological parent, you’ve discovered a genetic disconnect that is known by a number of different terms. Initially, the term NPE was used, but other terms have been added over the years, and they are sometimes used differently, depending on who is speaking.

  • NPE – Non-Parental Event, Not Parent Expected
  • MPE – Misattributed Paternal/Parental Event or Misattributed Parentage Experience
  • Undocumented Adoption – Regardless of how the situation occurred, it was not documented.

Please, please do NOT jump to conclusions and make assumptions about infidelity and duplicity. There can be many reasons for this occurrence, including:

  • Agreed upon “open” relationships
  • Intentional impregnation when one partner is infertile
  • Surrogacy
  • Infidelity
  • Rape
  • Sperm donor
  • Adoption
  • Unknown first marriage, with step-father raising a child as his own
  • Illegitimate birth of a child before marriage
  • Lifestyle choices
  • Intoxication
  • Coercion

In other words, the situation may have been known to the involved parties, even if they did not share that information with you or others. Prior to the last 20 years, no one would ever have considered that this information might ever be revealed. Social norms and judgments were very different a generation or more ago.

I wrote about this in the article, Things That Need To Be Said: Adoption, Adultery, Coercion, Rape, and DNA.

Of course, these events could happen in any generation, but the closer to you, in time, the more evident it will be when looking at your matches.

Now that we’ve determined that we have an unknown parent or grandparent, how do we sort this out?

Let’s Start with the Basics

I’m going to begin by explaining the basics of the different kinds of tests, and when each test can be used.

In this series, we will be focused on searching for six individuals, separately – both parents and all four grandparents.

You will be able to use the same techniques for ancestors in more distant generations by following the same instructions and methodologies, just adapting to include more matches to reach further back in time.

We will be taking the search step-by-step in each article.

Four Kinds of DNA

For genealogy, we can work with four kinds of DNA:

  • Y DNA
  • Mitochondrial DNA
  • Autosomal DNA
  • X chromosome DNA

We can potentially use each of these when searching for unknown ancestors, including parents and grandparents. Each type of DNA has specific characteristics and uses in different situations because it’s inherited differently by the son and daughter, below.

In these examples, everything is from the perspective of the son and daughter.

Y DNA testing is only available to males, because only males have a Y chromosome which is inherited directly from the father, shown by the blue arrow. In other words, the son has the father’s Y chromosome (and generally his surname,) but the daughter does not.

The Y chromosome can provide surnames and very close matches, or reach far back in time, or both. Ideally, Y DNA is used in conjunction with autosomal testing when searching for unknown individuals.

Mitochondrial DNA can be tested by everyone since males and females both receive mitochondrial DNA from their mother, passed to her from her direct maternal line, shown by the pink arrows and the yellow hearts. Both the son and daughter can test for their mother’s mitochondrial DNA.

Both Y DNA and mitochondrial DNA can reach far back in time, but can also be informative of recent connections. Neither is ever mixed with the DNA of the other parent, so the DNA is not diluted over the generations.

Think of Y DNA and mitochondrial DNA as having the ability to provide recent genealogy information and connections, plus a deep dive on just one particular line. Fortunately, when you’re looking for parents, the lines they test are the direct maternal (or matrilineal) line and the direct paternal (or patrilineal) line.

Both Y DNA and mitochondrial DNA tests are deep, not broad. One line each.

Y DNA and mitochondrial DNA will both be able to tell you if that specific ancestral line is European, African, Native American, Asian, Jewish, and so forth. Additionally, both offer matching at FamilyTreeDNA, information about where other testers’ ancestors are found in the world, and more.

If you want more information about what these tests have to offer, now, I provide a Y DNA Resource page, here, and a Mitochondrial DNA Resource page, here.

Autosomal DNA is the DNA contributed to you on chromosomes 1-22 by your ancestors from across all your ancestral lines in your tree, shown by the green arrow.

Everyone receives half of their autosomal DNA from each parent, with the exception of the X chromosome, which we’ll discuss in a minute.

This means that because the parent’s DNA is cut in half in each generation, the contributions of more distant ancestors’ DNA are reduced over time, with each generational division, until it’s no longer discernable or disappears altogether.
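The halving described above is easy to put into numbers. The short sketch below shows the average share you would expect from a single ancestor at each generation. The function name `expected_share` is just an illustration, and these are averages only. Because of the way DNA recombines, the actual amount inherited from any one distant ancestor can be higher, lower, or nothing at all.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor."""
    # Each generation halves the contribution: 1/2, 1/4, 1/8, ...
    return 0.5 ** generations_back


# Generation 1 is your parents, generation 2 your grandparents, and so on.
for generation in range(1, 8):
    ancestors = 2 ** generation
    share = expected_share(generation)
    print(f"Generation {generation}: {ancestors:>3} ancestors, about {share:.2%} each")
```

By the seventh generation, each of the 128 ancestors contributes less than 1% on average, which is why some of them no longer show up in your results.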

Autosomal DNA is broad across many lines, but not deep.

This figure provided by Dr. Paul Maier at FamilyTreeDNA, in the MyOrigins 3.0 White Paper, illustrates that by the 7th generation, you won’t receive DNA from a few of your ancestors. Some may be contained in segments too small to be reported by DNA testing vendors.

Translated, this means that autosomal DNA matching is most reliable in the closest generations, which is where we are working.

There is no documented occurrence of second cousins who don’t match each other. About 90% of third cousins match, as do about 50% of fourth cousins. I wrote about that in the article, Why Don’t I Match My Cousin?

The 23rd Chromosome – Sex Determination

Autosomal DNA generally refers to chromosomes 1-22. The 23rd chromosome pair determines biological sex.

Males have a Y chromosome contributed by their father, and an X contributed by their mother. The Y chromosome is what makes males, male.

Females have an X chromosome contributed by both their mother and father, which recombines just like chromosomes 1-22, but women have no Y chromosome.

In this graphic, you can see that a male child receives the father’s Y chromosome and the mother’s X. The female child receives an X chromosome from both parents.

Only FamilyTreeDNA and 23andMe report X chromosome results by including them with their autosomal DNA test.

Let’s take a look at how the X chromosome works in a little more detail.

X Chromosome DNA is another type of autosomal DNA, meaning it can be inherited from both parents in some circumstances. However, the X chromosome has a different inheritance path which means we analyze it differently for genealogy.

The father gives an X or a Y chromosome to his offspring, but not both.

If the child inherits the Y chromosome from the father, the child becomes a male. If the child inherits the X chromosome from the father, the child becomes a female.

Men only receive an X chromosome from their mother since they receive a Y chromosome from their father. Men can inherit a mixture of their mother’s X chromosomes that were contributed to their mother from both her mother (peach) and father (green.) Alternatively, men can inherit their maternal grandmother’s or maternal grandfather’s X chromosome intact.

In this example, the mother and father have three sons. None of the sons can inherit an X chromosome from their father, whose X chromosome is shown in yellow. The father gives the sons his Y chromosome, not shown here, instead of an X, which is how they become males. Males only inherit their X chromosome from their mother.

The mother inherited one copy of her X chromosome from her father, shown in green, and one copy from her mother, shown in peach.

  1. The first son inherited his maternal grandfather’s green X chromosome, intact, from his mother, and none of his maternal grandmother’s peach X chromosome.
  2. The second son inherited a portion of his maternal grandmother’s peach X chromosome and a portion of his maternal grandfather’s green X chromosome. I’ve shown the portions as half, but the division could vary.
  3. The third son inherited his maternal grandmother’s peach X chromosome, intact, and none of his maternal grandfather’s green X chromosome.

This means if you match a man on his X chromosome, assuming it’s a valid match and not identical by chance, that match MUST come from his mother’s line.

In a future article, I’ll provide some X-specific fan charts and tips to help you easily discern potential X inheritance paths.

Women inherit an X chromosome from both their mother and father. They inherit their father’s X chromosome intact that he received from his mother, because he only has one X to give his daughter. Therefore, daughters inherit their paternal grandmother’s X chromosome from their father, because he passes on exactly what he received from his mother.

In this graphic, the father and mother have three daughters. You can see that each daughter receives the father’s yellow X chromosome that he inherited from his mother.

He doesn’t have a second copy of an X chromosome to mix with his mother’s.

Women inherit their mother’s X chromosome in the same fashion that men do. You can see in our example that:

  • The first daughter inherited her father’s yellow X chromosome, plus her maternal grandmother’s peach X chromosome, intact, and none of her maternal grandfather’s green X chromosome.
  • The second daughter inherited her father’s yellow X chromosome, plus part of her maternal grandfather’s green X chromosome and part of her maternal grandmother’s peach X chromosome from her mother. The portions of the mother’s peach and green chromosomes inherited by the daughter can vary widely.
  • The third daughter inherited her father’s yellow X chromosome, plus her maternal grandfather’s green X chromosome, intact, which is his mother’s X chromosome, of course. This daughter inherited none of her maternal grandmother’s peach X chromosome.

Women inherit two X chromosomes, one from each parent, while men only inherit one X, contributed from their mother. This means that X matches have different inheritance paths for women and men.
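The two rules above (a man receives his X only from his mother, a woman receives one X from each parent) are enough to work out which ancestors could have contributed X DNA to a tester. The sketch below is one way to list them. The function name `x_ancestors` and the path notation are my own shorthand for illustration, with M meaning mother and F meaning father, read outward from the tester.

```python
def x_ancestors(sex: str, generations: int, path: str = "") -> list[str]:
    """Return the ancestor paths that can contribute X DNA to a tester.

    Paths use 'M' for mother and 'F' for father, so 'FM' is the
    father's mother (the paternal grandmother).
    """
    if generations == 0:
        return []
    # A male receives his X only from his mother.
    # A female receives one X from her mother and one from her father.
    if sex == "male":
        parents = [("M", "female")]
    else:
        parents = [("M", "female"), ("F", "male")]
    found = []
    for step, parent_sex in parents:
        found.append(path + step)
        found.extend(x_ancestors(parent_sex, generations - 1, path + step))
    return found


print("Son:     ", x_ancestors("male", 2))
print("Daughter:", x_ancestors("female", 2))
```

For a son, the list contains only his mother and her two parents. For a daughter, it adds her father and her paternal grandmother, but never her paternal grandfather, because he gave her father a Y chromosome rather than an X.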

Because the X inheritance path involves the mother, many people confuse mitochondrial DNA inheritance with X inheritance. I wrote about that in the article, X Matching and Mitochondrial DNA is NOT the Same Thing.

Testing Strategies and Vendor Strengths

In the next article, we will be discussing detailed testing strategies based on multiple factors:

  • Who you are searching for in your tree
  • Who, other than you, is available to test
  • Sex of the tester(s)
  • Vendor strengths and unique offerings
  • Urgency, or not
  • Using combinations of vendor results and why you want to

Getting lucky may be what you hope for, but it’s not a strategy.😊

_____________________________________________________________

Follow DNAexplain on Facebook, here or follow me on Twitter, here.

Share the Love!

You’re always welcome to forward articles or links to friends and share on social media.


DNA-eXplained Celebrates Tenth Anniversary!

This blog, DNA-eXplained, is celebrating its 10th anniversary today. How time flies!

I never thought for a minute about a 10th anniversary when I launched that first article.

I started blogging to teach people and literally “explain” about genetic genealogy – which is why I selected the name DNA-eXplained. Over time, it has also been nicknamed DNAeXplain, which is fine.

I hoped to be able to answer questions once, with graphics and examples, instead of over and over again off-the-cuff. I needed someplace where people could be referred for answers. Blogging seemed like the perfect medium for achieving exactly that.

Blogs allow writers to publish content attractively and react to changes and announcements quickly.

Blogs encourage readers to subscribe for email delivery or use RSS reader aggregation and can publish to social media.

Content can be located easily using browser searches.

Everything, all content, is indexed and searchable by keyword or phrase.

Blogging certainly seemed like the right solution. Still, I was hesitant.

I vividly remember working at my desk that day, a different desk in a different location, and anguishing before pressing the “publish” button that first time. Was I really, REALLY sure? I had the sense that I was sitting in one of those life-defining fork-in-the-road moments and once embarked upon, there would be no turning back.

I’m so glad I closed my eyes and pushed that button!

I knew we were going to be in for an incredible journey. Of course, I had no idea where that roller coaster ride was going, but we would be riding together, regardless. What a journey it has been!

A decade later, I’ve had the opportunity to meet and become friends with so many of you, both online and in person. I’ve met countless cousins I never knew I had, thanks to various blog articles, including the 52 Ancestors series which has turned out to be 365 and counting.

I am incredibly grateful for this opportunity! I thought I was giving to others, yet I’ve been greatly enriched by this experience and all of you.

So much has changed in all of our lives.

Looking Back

Today, as I look back at that very short first article, I can’t help but think just how unbelievably far we’ve come.

There was one Y and mitochondrial DNA testing vendor in 2012, FamilyTreeDNA, and that’s still the case today.

There were three autosomal testing companies, 23andMe, FamilyTreeDNA, and Ancestry, in addition to the Genographic Project, which was sunset in 2019 after an amazing 15-year run. GEDmatch was two years old in 2012 and had been formed to fill the need for advanced autosomal matching tools. In 2016, MyHeritage joined the autosomal testing market. All of those companies have since been acquired.

In 2012, FamilyTreeDNA broke ground by accepting uploaded DNA files from other vendors. Autosomal DNA tests cost about $300 although prices were dropping. I don’t anticipate prices dropping much further now, because companies have to maintain a reasonable profit margin to stay in business.

In 2013, when DNA-eXplained celebrated its first anniversary, I had published 162 articles.

That first year was VERY busy with lots of innovation occurring in the industry. You can read my end-of-year article, 2012 Top 10 Genetic Genealogy Happenings if you’d like to reminisce a bit. For comparison, here’s my Genetic Genealogy at 20 Years summary.

The World is Our Oyster

In the past decade, I’ve penned articles in a wide variety of locations, in several countries, on 5 continents.

I’ve written in my offices, of course, but also in cars, on buses, trains, and planes. I’ve crafted several articles on ships while cruising. In fact, writing is one of my favorite “sea-day” things to do, often sitting on deck if it’s a nice day.

I’ve written in cemeteries, which shouldn’t surprise you, on the hood of my car, and cross-legged on the floor at innumerable conferences.

I’ve composed at picnic tables and in countless hotel lobbies, libraries, laboratories, restaurants, and coffee shops. And, in at least 3 castles.

I’ve written while on archaeology digs, balancing my laptop on my knees while sitting on an inverted bucket, trying to keep dirt, sand, and ever-present insects away.

I’ve even written in hospitals, both as a visitor and a patient. Yea, I might not have told you about that.

I’ve pretty much taken you with me everyplace I’ve gone for the past decade. And we are no place near finished!

Today

This article is number 1531, which means I’ve published an article about every 2.4 days for a decade. Truthfully, I’m stunned. I had no idea that I had been that prolific. I never have writer’s block. In fact, I have the opposite problem. So many wonderful topics to write about and never enough time.

A huge, HUGE thank you to all of my readers. Writers don’t write if people don’t read!

DNA-eXplained has received millions and millions of views and is very popular, thanks to all of you.

There have been more than 48,000 comments, 4,800 a year or about 13 each day, and yes, I read every single one before approving it for publication.

Akismet, my spam blocker, only reports for 45 months, but in that time alone, there have been about 100,000 attempted SPAM comments. That equates to about 75 each day and THANK GOODNESS I don’t have to deal with those.

WordPress doesn’t count “pages,” as such, but if my articles average 10 pages each, and each page averages 500 words, then we’re looking at someplace between 7 and 8 million words. That’s 13 times the size of War and Peace😊. Not only do I write each article, but I proofread it several times too.
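If you’d like to check the arithmetic in this section, it can be roughly reproduced with a few lines. The article and comment counts and the per-article averages come from the text above. The figure of 3,652 days in the decade is my assumption (ten years with two leap days), and the variable names are just for illustration.

```python
articles = 1531
comments = 48_000
years = 10
days_in_decade = 3652  # assumption: ten years that include two leap days

pages_per_article = 10  # the rough average used above
words_per_page = 500    # the rough average used above

days_per_article = days_in_decade / articles
comments_per_year = comments / years
comments_per_day = comments_per_year / 365
total_words = articles * pages_per_article * words_per_page

print(f"One article about every {days_per_article:.1f} days")
print(f"About {comments_per_year:,.0f} comments a year, or {comments_per_day:.0f} a day")
print(f"Roughly {total_words:,} words in total")
```

The word total lands between 7 and 8 million, as stated above.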

Peering Into the Future

Genetic genealogy as a whole continues to produce the unexpected and solve mysteries.

Tools like triangulation in general, Family Matching at FamilyTreeDNA, genetic trees at 23andMe, Theories of Family Relativity at MyHeritage, and ThruLines at Ancestry have provided hints and tools to both suggest and confirm relationships and break through brick walls.

Ethnicity chromosome painting at both 23andMe and FamilyTreeDNA helps unravel ancestral mysteries, especially for people with combinations of fundamentally different ancestries, as do Genetic Communities at Ancestry and Genetic Groups at MyHeritage.

Third-party tools that we love today weren’t even a twinkle in a developer’s eye in 2012. Products like DNAPainter, Genetic Affairs, and DNAGedcom pick up where the vendors leave off and are widely utilized by genealogists.

I hope that all of our vendors continue to invest in product development and provide the genetic genealogy community with new and innovative tools that assist us with breaking down those pesky brick walls.

Primarily, though, I hope you continue to enjoy your genealogy journey and make steady progress, with a rocket boost from genetic testing.

The vendors can provide wonderful tools, but it’s up to us to use them consistently, wringing out every possible drop. Don’t neglect paternal (male surname) Y DNA and matrilineal mitochondrial DNA testing for people who carry those important lines for your ancestors. All 4 kinds of DNA have a very specific and unique genealogical use.

I encourage you to test every relative you can and check their and your results often. New people test every single day. You never know where that critical piece of information will come from, or when that essential puzzle piece will drop into place.

Be sure to upload to both FamilyTreeDNA and MyHeritage (plus GEDMatch) so you are in the database of all the vendors. (Instructions here.) Fate favors the prepared.

Thank You!!

Thank you from the bottom of my heart for supporting me by reading and sharing my articles with your friends, organizations, and family members, by purchasing through the affiliate links, by buying my book, and by graciously sharing your own experiences.

Thank you for your suggestions and questions which plant the seeds of new articles and improvements.

I hope you’ve made progress with your research, unraveled some thorny knots, and that you’ve enjoyed this decade as much as I have. Tell me in the comments what you enjoyed the most or found most useful.

Here’s to another wonderful 10 years together!

___________________________________________________________


FamilyTreeDNA DISCOVER™ Launches – Including Y DNA Haplogroup Ages

FamilyTreeDNA just released an amazing new group of public Y DNA tools.

Yes, a group of tools – not just one.

The new Discover tools, which you can access here, aren’t just for people who have tested at FamilyTreeDNA. You don’t need an account and it’s free for everyone. All you need is a Y DNA haplogroup – from any source.

I’m going to introduce each tool briefly because you’re going to want to run right over and try Discover for yourself. In fact, you might follow along with this article.

Y DNA Haplogroup Aging

The new Discover page provides seven beta tools, including Y DNA haplogroup aging.

Haplogroup aging is THE single most requested feature – and it’s here!

Discover also scales for mobile devices.

Free Beta Tool

Beta means that FamilyTreeDNA is seeking your feedback to determine which of these tools will be incorporated into their regular product, so expect a survey.

If you’d like changes or something additional, please let FamilyTreeDNA know via the survey, their support line, email or Chat function.

OK, let’s get started!

Enter Your Haplogroup

Enter your Y DNA haplogroup, or the haplogroup you’re interested in viewing.

If you’re a male who has tested with FamilyTreeDNA, sign on to your home page and locate your haplogroup badge at the lower right corner.

If you’re a female, you may be able to test a male relative or find a haplogroup relevant to your genealogy by visiting your surname group project page to locate the haplogroup for your ancestor.

I’ll use one of my genealogy lines as an example.

In this case, several Y DNA testers appear under my ancestor, James Crumley, in the Crumley DNA project.

Within this group of testers, we have two different Big Y haplogroups, and several estimated haplogroups from testers who have not upgraded to the Big Y.

If you’re a male who has tested at either 23andMe or LivingDNA, you can enter your Y DNA haplogroup from that source as well. Those vendors provide high-level haplogroups.

The great thing about the new Discover tool is that no matter what haplogroup you enter, there’s something for you to enjoy.

I’m going to use haplogroup I-FT272214, the haplogroup of my ancestor, James Crumley, confirmed through multiple descendants. His son John’s descendants carry haplogroup I-BY165368 in addition to I-FT272214, which is why there are two detailed haplogroups displayed for this grouping within the Crumley haplogroup project, in addition to the less-refined I-M223.

Getting Started

When you click on Discover, you’ll be asked to register briefly, agree to terms, and provide your email address.

Click “View my report” and your haplogroup report will appear.

Y DNA Haplogroup Report

For any haplogroup you enter, you’ll receive a haplogroup report that includes 7 separate pages, shown by tabs at the top of your report.

The first page you’ll see is the Haplogroup Report.

On the first page, you’ll find Haplogroup aging. The TMRCA (time to most recent common ancestor) is provided, plus more!

The report says that haplogroup I-FT272214 was “born,” meaning the mutation that defines this haplogroup occurred, about 300 years ago, plus or minus 150 years.

James Crumley was born about 1710. We know his sons carry haplogroup I-FT272214, but we don’t know when that mutation occurred because we don’t have upstream testers. We don’t know who his parents were.

Three hundred years before the birth of our Crumley tester would be about 1670, so roughly James Crumley’s father’s generation, which makes sense.

James’ son John’s descendants have an additional mutation, so that makes sense too. SNP mutations are known to occur approximately every 80 years, on average. Of course, you know what average means…may not fit any specific situation exactly.

The next upstream haplogroup is I-BY100549 which occurred roughly 500 years ago, plus or minus 150 years. (Hint – if you want to view a haplogroup report for this upstream haplogroup, just click on the haplogroup name.)
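To see how these age estimates translate into calendar years, here is a small sketch of the arithmetic. The reference year of 1970 is an assumption inferred from the statement above that 300 years before the tester’s birth is about 1670. The function name `formation_window` is hypothetical, and treating “plus or minus 150 years” as a simple symmetric window is a simplification, not a description of how Discover calculates its ranges.

```python
def formation_window(reference_year: int, years_ago: int, margin: int) -> tuple[int, int, int]:
    """Return (earliest, estimate, latest) calendar years for a haplogroup's formation."""
    estimate = reference_year - years_ago
    return estimate - margin, estimate, estimate + margin


REFERENCE_YEAR = 1970  # assumption: inferred from "300 years ... would be about 1670"

for name, years_ago in [("I-FT272214", 300), ("I-BY100549", 500)]:
    earliest, estimate, latest = formation_window(REFERENCE_YEAR, years_ago, 150)
    print(f"{name}: about {estimate} (between {earliest} and {latest})")
```

James Crumley’s birth, about 1710, falls comfortably inside the window for I-FT272214.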

There are 5 SNP confirmed descendants of haplogroup I-FT272214 claiming origins in England, all of whom are in the Crumley DNA project.

Haplogroup descendants mean this haplogroup and any other haplogroups formed on the tree beneath this haplogroup.

Share

If you scroll down a bit, you can see the share button on each page. If you think this is fun, you can share through a variety of social media resources, email, or copy the link.

Sharing is a good way to get family members and others interested in both genealogy and genetic genealogy. Light the spark!

I’m going to be sharing with collaborative family genealogy groups on Facebook and Twitter. I can also share with people who may not be genealogists, but who will think these findings are interesting.

If you keep scrolling under the share button or click on “Discover More” you can order Y DNA tests if you’re a biological male and haven’t already taken one. The more refined your haplogroup, the more relevant your information will be on the Discover page as well as on your personal page.

Scrolling even further down provides information about methods and sources.

Country Frequency

The next tab is Country Frequency showing the locations where testers with this haplogroup indicate that their earliest known ancestors are found.

The Crumley haplogroup has only 5 people, which is less than 1% of the people with ancestors from England.

However, taking a look at haplogroup R-M222 with many more testers, we see something a bit different.

Ireland is where R-M222 is found most frequently. 17% of the men who report their ancestors are from Ireland belong to haplogroup R-M222.

Note that this percentage also includes haplogroups downstream of haplogroup R-M222.

Mousing over any other location provides that same information for that area as well.

Seeing where the ancestors of your haplogroup matches are from can be extremely informative. The more refined your haplogroup, the more useful these tools will be for you. Big Y testers will benefit the most.

Notable Connections

On the next page, you’ll discover which notable people have haplogroups either close to you…or maybe quite distant.

Your first Notable Connection will be the one closest to your haplogroup that FamilyTreeDNA was able to identify in their database. In some cases, the individual has tested, but in many cases, descendants of a common ancestor tested.

In this case, Bill Gates is our closest notable person. Our common haplogroup, meaning the intersection of Bill Gates’s haplogroup and my Crumley cousin’s haplogroup, is I-L1145. The SNP mutation that defines haplogroup I-L1145 occurred about 4600 years ago. Both my Crumley cousin and Bill Gates descend from that man.

If you’re curious and want to learn more about your common haplogroup, remember, you can enter that haplogroup into the Discover tool. Kind of like genetic time travel. But let’s finish this one first.

Remember that CE means current era, or the number of years since the year “zero,” which doesn’t technically exist but functions as the beginning of the current era. Bill Gates was born in 1955 CE.

BCE means “before current era,” meaning the number of years before the year “zero.” So 2600 BCE is approximately 4600 years ago.
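If you want to convert a BCE date yourself, the calculation can be sketched like this. The function name is hypothetical, and the reference year of 2022 is an assumption on my part, so substitute whatever year you are counting back from.

```python
def years_ago_from_bce(bce_year, current_year):
    """Approximate how many years ago a BCE year was.

    There is no year zero: 1 BCE is followed directly by 1 CE,
    so the span is one year shorter than a simple sum.
    """
    return bce_year + current_year - 1


# Assumed reference year of 2022 (not stated in the article).
print(years_ago_from_bce(2600, 2022))  # 4621, or "approximately 4600 years ago"
```

Given the size of the ranges involved in haplogroup dating, the one-year correction is only a nicety, which is why rounding to 4600 is perfectly reasonable.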

Click through each dot for a fun look at who you’re “related to” and how distantly.

This tool is just for fun and reinforces the fact that at some level, we’re all related to each other.

Maybe you’re aware of more notables that could be added to the Discover pages.

Migration Map

The next tab provides brand spanking new migration maps that show the exodus of the various haplogroups out of Africa, through the Middle East, and in this case, into Europe.

Additionally, the little shovel icons show the ancient DNA sites that date to the haplogroup age for the haplogroup shown on the map, or younger. In our case, that’s haplogroup I-M223 (red arrow) that was formed about 16,000 years ago in Europe, near the red circle, at left. These haplogroup ancient sites (shovels) would all date to 16,000 years ago or younger, meaning they lived between 16,000 years ago and now.

By clicking on a shovel icon, more information is provided. It’s very interesting that I-L1145, the common haplogroup with Bill Gates is found in ancient DNA in Cardiff, Wales.

This is getting VERY interesting. Let’s look at the rest of the Ancient Connections.

Ancient Connections

Our closest Ancient Connection in time is Gen Scot 24 (so named in an academic paper) who lived in the Western Isles of Scotland.

These ancient connections are more likely cousins than direct ancestors, but of course, we can’t say for sure. We do know that the first man to develop haplogroup I-L126, about 2500 years ago, is an ancestor to both Gen Scot 24 and our Crumley ancestor.

Gen Scot 24 has been dated to 1445-1268 BCE, about 3400 years ago, which could actually be older than the haplogroup age. Remember that both dating types are ranges, carbon dating is not 100% accurate, and ancient DNA can be difficult to sequence. Haplogroup ages are refined as more branches are discovered and the tree grows.

The convergence of these different technologies in a way that allows us to view the past in the context of our ancestors is truly amazing.

All of our Crumley cousin’s ancient relatives are found in Ireland or Scotland with the exception of the one found in Wales. I think, between this information and the haplogroup formation dates, it’s safe to say that our Crumley ancestors have been in either Scotland or Ireland for the past 4600 years, at least. And someone took a side trip to Wales, probably settled and died there.

Of course, now I need to research what was happening in Ireland and Scotland 4600 years ago because I know my ancestors were involved.

Suggested Projects

I’m EXTREMELY pleased to see suggested projects for this haplogroup based on which projects haplogroup members have joined.

You can click on any of the panels to read more about the project. Remember that not everyone joins a project because of their Y DNA line. Many projects accept people who are autosomally related or descend from the family through the mitochondrial line, the direct mother’s line.

Still, seeing the Crumley surname project would be a great “hint” all by itself if I didn’t already have that information.

Scientific Details

The Scientific Details page actually has three tabs.

The first tab is Age Estimate.

The Age Estimate tab provides more information about the haplogroup age or TMRCA (Time to Most Recent Common Ancestor) calculations. For haplogroup I-FT272214, the most likely creation date, meaning when the SNP occurred, is about 1709, which just happens to align well with the birth of James Crumley about 1710.

However, anyplace in the dark blue band would fall within a 68% confidence interval (CI). That would put the most likely years that the haplogroup-defining SNP mutation took place between 1634 and 1773. At the lower end of the frequency spectrum, there’s a 99% likelihood that the common ancestor was born between 1451 and 1874. That means we’re 99% certain that the haplogroup defining SNP occurred between those dates. The broader the date range, the more certain we can be that the results fall into that range.

The next tab, Variants, provides the “normal” or ancestral variant and the derived or mutated variant or SNP (Single Nucleotide Polymorphism) in the position that defines haplogroup I-FT272214.

The third tab displays FamilyTreeDNA’s public Y DNA Tree with this haplogroup highlighted. On the tree, we can see this haplogroup, downstream haplogroups as well as upstream, along with their country flags.

Your Personal Page

If you have already taken a DNA test at FamilyTreeDNA, you can find the new Discover tool conveniently located under “Additional Tests and Tools.”

If you are a male and haven’t yet tested, then you’ll want to order a Y DNA test or upgrade to the Big Y for the most refined haplogroup possible.

Big Y tests and testers are why the Y DNA tree now has more than 50,000 branches and 460,000 variants. Testing fuels growth and growth fuels new tools and possibilities for genealogists.

What Do You Think?

Do you like these tools?

What have you learned? Have you shared this with your family members? What did they have to say? Maybe we can get Uncle Charley interested after all!

Let me know how you’re using these tools and how they are helping you interpret your Y DNA results and assist your genealogy.

_____________________________________________________________

Follow DNAexplain on Facebook, here or follow me on Twitter, here.

Share the Love!

You’re always welcome to forward articles or links to friends and share on social media.

If you haven’t already subscribed (it’s free), you can receive an email whenever I publish by clicking the “follow” button on the main blog page, here.

You Can Help Keep This Blog Free

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Uploads

Genealogy Products and Services

My Book

Genealogy Books

Genealogy Research

Just Released – Mitochondrial Haplogroup L7 Video!

I’m still VERY excited about the haplogroup L7 discovery. Mitochondrial Eve’s new 100,000-year-old great-granddaughter. So is the rest of the Million Mito Team.

We’ve created a short video explaining just why this is so cool.

Dr. Paul Maier, the Population Geneticist on our Million Mito team, did a great job as producer. He’s certainly multi-talented! Thanks, Paul.

Please understand that this is “just us,” no professional production, editors or anything like that. You’re seeing the real deal here. This video is something we wanted to do for all of you. We’re excited to tell this amazing story – one that we’ve explained in terms that everyone can understand and enjoy. We want you to love mitochondrial DNA as much as we do.

Please share this video far and wide with your family and friends. Remind them that everyone inherits their mother’s (and only their mother’s) mitochondrial DNA. They can make cool discoveries too.

But wait, there’s more!

Dr. Miguel Vilar’s Article

FamilyTreeDNA just published a guest blog article titled A 100,000-Year-Old Human Lineage Rediscovered, written by genetic anthropologist Dr. Miguel Vilar.

You’ll recognize Miguel as one of the four Million Mito team members in the video, but you may also remember him as the Senior Program Officer for the National Geographic Society and the Lead Scientist for the Genographic Project.

I think you’ll agree, he’s a great writer too!

What’s Your Story?

Not only is mitochondrial DNA (mtDNA) useful genealogically, it’s the story of all womankind. You don’t have to be a genealogist to appreciate and enjoy your mtDNA journey.

Mitochondrial DNA tells a story about each of us that we would never know otherwise.

The best part is that every single person can test their own mitochondrial DNA to learn more about their family story – and very specifically about their mother’s direct line ancestry that may be eclipsed or overshadowed in autosomal DNA by more recent admixture.

Where does your mitochondrial DNA lead?

What Else Can You Do?

You, your mother, and your maternal siblings all share the same mitochondrial DNA, passed to you by your mother. But what about your father? He inherited HIS mother’s mitochondrial DNA, but you didn’t.

You can discover your paternal grandmother’s mtDNA story by testing your father’s mtDNA, or his maternal line siblings if he’s not available for testing.

Your paternal grandmother’s story is your family story too!

Let me know if you like the video and if it makes mtDNA easier to understand and explain to your relatives. I hope this discovery and video help sow the seeds of curiosity.

Mitochondrial Eve Gets a Great-Granddaughter: African Mitochondrial Haplogroup L7 Discovered

Such wonderful news today!

We have a birth announcement, of sorts, detailed in our new paper released just today, “African mitochondrial haplogroup L7: a 100,000-year-old maternal human lineage discovered through reassessment and new sequencing.”

Woohoo, Mitochondrial Eve has a new great-granddaughter!

Back in 2018, Goran Runfeldt and Bennett Greenspan at FamilyTreeDNA noticed something unusual about a few mitochondrial DNA sequences, but there weren’t enough sequences to be able to draw any conclusions. As time went on, more sequences became available, both in the FamilyTreeDNA database and in the academic community, including an ancient sequence.

This group of sequences did not fit cleanly into the phylogenetic tree as structured and seemed to cluster together, but more research and analysis were needed.

Were these unique sequences a separate branch? One branch or several? What would creating that branch do to the rest of the tree?

Given that Phylotree, last updated in 2016, did not contain an applicable branch, what were we to do with these puzzle pieces that really didn’t fit?

These discussions, and others similar, led to the decision to launch the Million Mito Project to update the mitochondrial phylogenetic tree which is now 6 years old and seriously out-of-date. For the record, phylogenetics on this scale is EXTREMELY challenging, which is probably why Phylotree hasn’t been updated, but that’s a topic for another article, another day. Today is the day to celebrate haplogroup L7.

Haplogroup L7

The Million Mito team knew there were lots of candidate haplogroups waiting to be formed near the ends of the branches of the phylotree, but what we didn’t expect was a new haplogroup near the root of the tree.

Put another way, in terms that genealogists are used to, the new branch is Eve’s great-granddaughter.

Haplogroup L now has 8 branches, instead of 7, beginning with L0. We named this new branch haplogroup L7 in order not to disrupt the naming patterns in the existing tree.

Let’s take a look.

I used the phylogenetic tree from our paper and added Eve.

Just to be clear, we aren’t talking literal daughters and granddaughters. These are phylogenetic daughters which represent many generations between each (known) branch. Of course, we can only measure the branches that survived and are tested today or are found in ancient DNA.

The only way we have of discovering and deciphering Eve and her “tree” of descendants is identifying mutations that occurred, providing breadcrumbs back in time that allow us to reconstruct Eve’s mitochondrial DNA sequence.

Those mutations are then carried forever in daughter branches (barring a back-mutation). This means that, yes, you and I have all of those mutations today – in addition to several more that define our individual branches.

You can see that Eve has two daughter branches. One branch, at left, is L0.

Eve’s daughter to the right, which I’ve labeled, is the path to the new L7 branch.

Before this new branch was identified, haplogroup L5 existed. Now, Eve has a new great-granddaughter branch L5’7 that then splits into two branches: L5 and L7.

L5 is the existing branch, but L7 is the new branch that includes a few sequences formerly misattributed to L5.

Even more exciting, the newly discovered haplogroup L7 has sub-branches too, including L7a, L7a1, L7b1 and L7b2.

In fact, haplogroup L7 has a total of 13 sublineages.

How Cool is This?!!

Haplogroup L7 is 100,000 years old. This is the oldest lineage discovered since haplogroup L5 was identified 20 years ago. To put this in perspective, that’s about the same time the first full sequence mitochondrial DNA test was offered to genealogists.

It took 20 years for enough people to test, and two eagle-eyed scientists to notice something unusual.

Hundreds of thousands of people have had their mitochondrial DNA tested, and so far, only 19 people are assigned to haplogroup L7 or a subgroup.

One of those people, shown as L7a* on the tree above, is 80,000 years removed from their closest relative. Yes, their DNA is hens-teeth rare. No, they don’t have any matches at FamilyTreeDNA, just in case you were wondering😊

However, in time, as more people test, they may well have matches. This is exactly why I encourage everyone to take a mitochondrial DNA test. If someone is discouraged from testing, you never know who they might have matched – or how rare their DNA may be. If they don’t test, that opportunity is lost forever – to them, to other people waiting for a match, and to science.

Are there other people out there with this haplogroup, in either Africa or the diaspora? Let’s hope so!

With so few L7 people existing today, it looks like this lineage might have been on the verge of extinction at some point, but somehow survived and is now found in a few places around the world.

Ancient DNA

One 16,000-year-old ancient DNA sample from Malawi has been reclassified from L5 to L7.

This figure from the paper shows the distribution of haplogroup L within Africa, and the figure below shows the Haplogroup L7 range within Africa, with Tanzania having the highest frequency. Malawi abuts Tanzania on the Southwest corner.

Where in the World?

Checking on the public tree at FamilyTreeDNA, you can see the new L5’7 branch with L7 and sub-haplogroups beneath.

We find L7 haplogroups in present-day testers from:

  • South Africa
  • Kenya
  • Ethiopia
  • Sudan
  • United Arab Emirates
  • Yemen
  • Tanzania

It’s also found in people who live in two European countries now, but with their roots reaching back into Africa. Surprisingly, no known African-Americans have yet tested with this haplogroup. I suspect finding the haplogroup in the Americas is just a matter of time, and testing.

The FamilyTreeDNA customers who are lucky enough to be in haplogroup L7 have had their haplogroup badges updated.

If you are haplogroup L at FamilyTreeDNA, check and see if you have a new badge.

Credit Where Credit is Due

I want to give a big shout-out to my colleagues and co-authors: Dr. Paul Maier (lead author), Dr. Miguel Vilar, and Goran Runfeldt.

I can’t even begin to express the amount of heavy lifting these fine scientists did on the long journey from initial discovery to publication. This includes months of analysis, writing the paper, creating the graphics, and recording a video which will be available soon.

I’m especially grateful to people like you who test their DNA, and academic researchers who continue to sequence mitochondrial DNA in both contemporary and ancient samples. Without testers, there would be no scientific discoveries, nor genealogy matching. If you haven’t yet tested, you can order (or upgrade) a mitochondrial DNA test here.

I also want to thank both Bennett Greenspan, Founder and President Emeritus of FamilyTreeDNA, who initially greenlit the Million Mito Project in early 2020, and Dr. Lior Rauschberger, CEO, who continues to support this research.

FamilyTreeDNA paid the open access fees so the paper is free for everyone, here, and not behind a paywall. If you’re downloading the pdf, be sure to download the supplements too. Lots of graphics and images that enhance the article greatly.

Congratulations to Mitochondrial Eve for this new branch in her family tree. Of course, her family tree is your family and mine – the family of man and womankind!

Ancestry Only Shows Shared Matches of 20 cM and Greater – What That Means & Why It Matters

Recently, I’ve noticed an uptick in confused people who’ve taken Ancestry’s DNA test.

They are using shared matches, which is a great tool and exactly what they should be doing, but they become confused when no shared matches appear with some specific people.

This is especially perplexing when they know, through information sharing or because they manage multiple DNA kits, that those two people who both match them actually do share DNA and match each other, meaning they “should” appear on a shared match list. Or, worse yet, conflicting match information is displayed, with one person showing the shared match, but the other person reciprocally does not.

What gives?

That’s exactly what this article addresses. It’s not quite as simple as it sounds, but it’s certainly easier once you understand.

What Are Matches and Shared Matches?

Matches occur when two people match each other. From your perspective as a DNA tester, matches are people who have taken DNA tests and appear on your match list because you share some level of DNA equal to or greater than the match threshold of the vendor in question.

At Ancestry, that minimum matching threshold is 8 cM (centimorgans) of matching DNA.

Individual matches are always one-to-one. Your match list is a list of people who all match you.

So, you match person 1, and you match person 2, individually.

Your matches may or may not also match each other. If they do match each other in addition to matching you, that’s a shared match which is a hint as to a potential common ancestor between all three people.

Shared matches are a list of people who match you PLUS any one other match on your list. In other words, shared matches are three-way matches.

In the diagram above, you can see that you match Match 1 and you also match Match 2. In this case, Match 1 and Match 2 also match each other, so all three of you match each other, but not necessarily on the same segment. Therefore, you’re all three shared matches, as shown in the center of the three circles.

Viewing Shared Matches

To view a list of people who match you and Match 1, you would request shared matches with Match 1 by clicking on “View Match” or “Learn More” on your match list, then on “Shared Matches” on the next screen.

The resulting shared match list consists of people who match you AND Match 1, both. It’s easy to make assumptions about why you have shared matches, but don’t.

Shared Matches are Hints

A shared match CAN mean:

  • That all three people share a common ancestral line.
  • You share a common ancestor with Match 1 and Match 2, but Match 1 and 2 match each other because they share an entirely different ancestor.
  • You match Match 1 because you share DNA from Ancestor A and you match Match 2 because you share DNA from Ancestor B. Match 1 and 2 match each other because they share one or both of those common ancestors.
  • Match 1 and Match 2 might match because Match 1 and Match 2 share an ancestor that isn’t related to you.
  • That one (or more) of the matches is identical by chance, meaning the DNA combined from two parents in a random way that just happens to match with someone else.

Shared matches are great hints to be sifted for relevance. The operative word here is hint.

What If We Don’t Have Shared Matches?

Conversely, NOT having a shared match doesn’t mean you don’t share a common ancestor.

Sorry about the triple negative. Let me say that another way, because this is important.

Even though you and someone else aren’t on a shared match list, you might still share DNA and you may share a common ancestor, whether you share their DNA or not.

Ancestry’s shared matches work differently than shared matches at other vendors. Before we discuss that, let’s talk about why shared matches are important.

Why Do Shared Matches Matter Anyway?

Matches and shared matches are how genealogists perform two critically important functions:

  • Verifying “known” ancestors. Sometimes paper trails aren’t accurate and certainly, neither are trees.
  • Identifying unknown ancestors. Looking for common families among shared DNA matches is a HUGE hint when tracking down those pesky unknown ancestors.

I wrote about shared matches, here, when Ancestry purged segments under 8 cM, but I think the message about the limitations of shared matches and how the process actually works deserves its own article, especially for new users. Shared matches and segment cM numbers can be quite confusing, but they don’t need to be.

I wrote an article titled DNA Beginnings: Matching at Ancestry and What It Means that includes lots of useful information.

Ok, now let’s look specifically at using shared matches and why sometimes shared matches just don’t seem to make sense.

Matches

By far, the majority of your matches at any vendor will be more distant matches. That’s because you have thousands of distant relatives, most of whom you don’t know (yet).

You’ll only have a few closer relatives.

At Ancestry, I have 102,000+ total matches, of which more than 97,000 are distant matches. Based on these numbers, keep in mind that about 95.74% of my matches are distant, meaning 20 cM or below, and yours probably are too. You’ll need that number later.

Note that 20 cM is Ancestry’s threshold between close matches and distant matches.

That’s about where you’d expect the threshold to be. A 20 cM match is, on average, at or further back than 4th cousins, roughly the 4th to 6th cousin level.

Of course, you won’t match most of your 5th cousins at all, yet you’ll match some with more than 20 cM. That’s just the roll of the genetic dice.

Closer ancestors (meaning closer matches) are also the area of genealogy where much of the lower-hanging fruit has been plucked.

In my case, the closest unknown ancestor in my tree occurs at the 6th generation level and I have 5 or 6 missing sixth-generation ancestors – all females with no surnames. Two have no names at all.

How Much DNA Do Cousins Share?

One of my priorities as a genealogist is to identify those unknown people, which is why matches and shared matching at that level are critical for me.

Ancestry tells me that this 20 cM match is likely my 4th-6th cousin.

At DNAPainter, in the Shared cM Tool, you can enter the total cM number of a match, which is the total amount of DNA that you share after Ancestry’s Timber algorithm has been applied. The range of relationship probabilities for 20 cM is shown below.

For a total match of 20 cM with another individual, several relationships ranging between half 3C2R/3C3R and 8th cousins are the most probable relationships at 58%.

For the record, this is total cM, which does not necessarily mean one segment. Ancestry reports the number of segments, but Ancestry does not show you the segment locations, nor do they have a chromosome browser. Without a chromosome browser, you have no way of determining whether or not you match with shared matches on the same segment(s). In other words, there is no triangulation at Ancestry, meaning confirmation of a specific shared DNA segment descended from a common ancestor. You can find triangulation resources, here.

Close Matches

The best way to figure out how you are related to closer matches (assuming you don’t already know them and Ancestry has not found a common ancestor) is using shared matches. Hopefully, you will share matches with people you do know or with whom you’ve already identified your common ancestor.

One of my relatively close DNA matches at Ancestry is Lonnie. I don’t know Lonnie, but it looks like I should because he’s probably a 1st or 2nd cousin. We share 357 cM of DNA over 20 segments.

I thought I knew all of my 1st and 2nd cousins. Let’s see if I can figure out how I’m related to Lonnie.

By clicking on Lonnie’s name on my match list, then on Shared Matches, I can determine that Lonnie and I connect through my Estes and Vannoy lines based on who we both match, which means that our common ancestor is either my paternal grandfather or my great-grandparents, Lazarus Estes and Elizabeth Vannoy.

You can see the notes I’ve made about these matches I share with Lonnie.

Viewing Lonnie’s unlinked tree verifies the ancestral line that shared matches suggest. An unlinked tree means that Lonnie has not linked his DNA test to himself in his tree. Since Ancestry doesn’t know who he is in the tree, they can’t find a common ancestor for me and Lonnie. However, I can by viewing his tree.

Our common ancestor is Lazarus Estes and his wife, Elizabeth Vannoy. Therefore, Lonnie is my 2nd cousin.

That wasn’t difficult, in part because I had already worked on the genealogy of our common matches and Lonnie had a small unlinked tree where I could confirm our common ancestor.

Now let’s move to more distant, not-so-easy matches.

Distant Matches

I’ve spent a lot of time over the years identifying common ancestors with my matches.

When I make that connection, whether or not Ancestry has been able to identify our common ancestor, I make notes about common ancestors and anything else that seems relevant. Notes very conveniently show on my match list so I don’t need to open each match to see how we are related.

Ancestry does identify potential common ancestors using ThruLines. Note the word potential. Ancestry compares the trees of you and your matches searching for common ancestors and suggests connections. It’s up to you to verify. ThruLines are hints, not gospel. Additionally, you may have multiple ancestral links to your matches. Ancestry can only work with the fact that you have a DNA match with someone AND the user-provided trees of your matches.

Ancestry’s ThruLines only reach back a maximum of 7 generations to suggest common ancestors. At 7 generations distance, you’d be a 5th cousin to a descendant who is also 7 generations downstream from that ancestor.
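The generation-to-cousin arithmetic can be sketched as a small Python function. This is only an illustration: the function name is hypothetical, and it assumes generations are counted with the tester as generation 1, which is the convention implied by a 7th generation ancestor producing 5th cousins.

```python
def cousin_relationship(generation_a, generation_b):
    """Describe how two descendants of the same ancestral couple are related.

    Each argument is the ancestor's generation number in that person's own
    tree, counting the person as generation 1 (an assumed convention).
    Returns shorthand such as '5C' (5th cousin) or '5C1R' (once removed).
    """
    nearest = min(generation_a, generation_b)
    if nearest < 3:
        raise ValueError("cousins need a common ancestor at generation 3 or further back")
    degree = nearest - 2                          # grandparents (generation 3) give 1st cousins
    removed = abs(generation_a - generation_b)    # difference in generations
    label = f"{degree}C"
    if removed:
        label += f"{removed}R"
    return label


print(cousin_relationship(7, 7))  # 5C
print(cousin_relationship(7, 8))  # 5C1R
```

When the two descendants are a different number of generations from the ancestor, the shorter line sets the cousin level and the difference becomes the “removed” count.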

Information from DNAPainter, which uses data compiled by the Shared cM Project, shows that the most likely amount of shared DNA for 5th cousins is, you guessed it, 20 cM.

Jacob Dobkins is my 7th generation ancestor. I have ThruLines for him and his wife, but not for their parents who are one generation too distant for ThruLines. I’d LOVE to see Ancestry extend ThruLines another 2 or 3 generations.

ThruLines matches me with people who descend from Jacob through his other children. Other children are important because the only ancestors you share with those people are (presumably) that ancestral couple.

Matches with Jacob’s descendants range from 8 cM (the smallest amount Ancestry reports) to 32 cM.

Here’s an example.

Ancestry displays some shared matches with all of your matches, regardless of the size of your match to that person. However, Ancestry ONLY shows shared matches to a third person if you share 20 cM or more of DNA with that third person.

For example, I match KO with 8 cM of DNA. Ancestry shows my shared matches with KO, below.

I only have 3 shared matches with KO. I only match KO at 8 cM, but I match our shared matches at 39, 31 and 21 cM, respectively.

Ancestry does NOT show shared matches below 20 cM, so it’s unknown how many additional shared matches KO and I actually have if shared matches less than 20 cM were displayed.

Perspective is Critical

Whether you see a shared match or not is sometimes a matter of perspective, meaning which of two people you request shared matches with.

In this case, I requested shared matches with KO. I only share 8 cM of DNA with KO, but that doesn’t matter. The amount of DNA you share with the person you’re requesting shared matches with is irrelevant.

Ancestry’s Shared Matches with KO include Ker

When I request shared matches with KO, I will see anyone KO and I both match, provided I match that person at 20 cM or above. That includes Ker.

If I request shared matches with Ker, with whom I share 39 cM of DNA, I will see all of our mutual matches at 20 cM (or greater) of DNA. However, that does NOT include KO because I only share 8 cM of DNA with KO.

This restriction applies regardless of how much DNA KO and Ker share, which is an unknown to me of course.

Ancestry’s Shared Matches with Ker does NOT include KO

Nothing has changed between these matches, yet KO does not appear on my shared matches list with Ker when I request shared matches with Ker.

I still share 8 cM with KO and 39 cM with Ker. KO and Ker still both match each other. The only difference is that Ker shows up on my shared match list with KO because I share more than 20 cM with Ker. However, when I request a match list with Ker, KO does NOT appear because I only share 8 cM with KO.

This is the source of the confusion and, often, the reason people disagree about shared matches. It’s kind of a “now you see it, now you don’t” situation.

Whether a person shows as a shared match depends on:

  1. Whether the third person actually does share DNA with both the tester and the person they’ve asked for shared matches with
  2. Whether the third person shares 20 cM of DNA or more with the tester, meaning the person requesting the shared match list with one of their matches

Whether someone appears on a shared match list can literally be a matter of perspective unless the match and the shared matches all match the tester at 20 cM or larger.
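Those two conditions can be sketched in a few lines of Python. This is only an illustration of the rule as I've described it in this article, not Ancestry's actual code. The function name `appears_as_shared_match` is made up for this example, and the amount of DNA that KO and Ker share with each other is a hypothetical placeholder, because that number is unknown to me.

```python
SHARED_MATCH_MIN_CM = 20  # the cutoff described in this article


def appears_as_shared_match(tester_to_third_cm, match_to_third_cm):
    """Would a third person show up on the tester's shared match list?

    tester_to_third_cm: cM the tester shares with the third person (0 = no match)
    match_to_third_cm:  cM the selected match shares with the third person (0 = no match)
    """
    # Condition 1: the third person must share DNA with both people.
    shares_with_both = tester_to_third_cm > 0 and match_to_third_cm > 0
    # Condition 2: the tester must share 20 cM or more with the third person.
    big_enough = tester_to_third_cm >= SHARED_MATCH_MIN_CM
    return shares_with_both and big_enough


KO_TO_KER_CM = 15  # hypothetical value; the real amount is unknown to me

# I match Ker at 39 cM, so Ker can appear when I view shared matches with KO.
print(appears_as_shared_match(39, KO_TO_KER_CM))
# I match KO at only 8 cM, so KO cannot appear when I view shared matches with Ker.
print(appears_as_shared_match(8, KO_TO_KER_CM))
```

Notice that the amount of DNA I share with the person I selected never enters the calculation. Only my match to the third person is tested against the 20 cM cutoff.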

Another Example

Let’s look at a larger match to a descendant of the same ancestor.

I share exactly 20 cM with Joyce, my 5C1R.

Viewing my shared matches with Joyce, I match 50 other people that she matches as well.

My smallest shared match with Joyce shares 25 cM of DNA with me. Apparently, there are no matches with Joyce with whom I share between 20 and 25 cM of DNA.

Bottom Line

Here’s the bottom line.

Ancestry NEVER shows any shared matches below 20 cM from the perspective of the tester, meaning people who match you and someone else, both.

If you recall our earlier math, that means that approximately 95.74% of my shared matches aren’t shown.

This puts shared matches in a different perspective because now I realize just how many matches I’m not seeing.

Why is This Confusing?

If you aren’t aware of this shared match limitation, and that a majority of your shared matches are actually below 20 cM, you may interpret shared match results to mean you actually DON’T share specific matches with that other person. That isn’t necessarily true, as we saw above with KO and Ker.

Furthermore, let’s say you manage your DNA kit plus 3 more, A, B and C. Because you manage all 4 kits, that means you can see the results for all 4 people. Your own kit matches the other three kits as follows:

  • A – 10 cM
  • B – 20 cM
  • C – 40 cM

From the perspective of YOUR kit, you will see some shared matches FOR all of those matches.

What you won’t see is any shared match (third person) whom you don’t match at 20 cM or greater.

Always remember, shared match information at Ancestry is ALWAYS from the perspective of your DNA kit combined with the person with whom you request the match.

I’ve put this information in a grid because that’s how I make sense of things like this.

Here are your matches. When you click on shared matches with person A who you match at 10 cM, you’ll see both person B and person C as shared matches since you match both of those people at 20 cM or larger. You WILL see 20 cM shared matches, but you will not see 19 cM shared matches.

When you request shared matches for A, you will see both B and C.

When you request shared matches with kits B and C, you will not see A because you only match A at 10 cM.

However, from the perspective of DNA kits A, B and C, shared matches look different.

Let’s look at shared matches from the perspective of Kits A, B and C.

Kit A matches you, Kit B and C, but can only see Kit B as a shared match because matches with you and Kit C are under 20 cM.

Kit B doesn’t match C at all, so they clearly won’t have shared matches. However, they do match you and Kit A, both at 20 cM and over, so Kit B will see you as a shared match with Kit A, and Kit A as a shared match with you.

Kit C doesn’t match Kit B, so no shared matches with that person at all. Kit C does match you and Kit A. However, when Kit C clicks on shared matches for you, Kit A doesn’t show up because Kit C only matches Kit A at 9 cM. When Kit C clicks on Kit A for shared matches, you ARE listed as a shared match because you share 40 cM of DNA with Kit C.

There’s no way to discern whether two of your matches match each other unless they show as a match in the shared match tool. You can’t tell if their absence on the shared match list means they actually don’t match, or their shared match absence is because they match you at less than 20 cM.
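If you'd rather check the grid than take my word for it, here is a small Python sketch that replays the four-kit example from every perspective. It assumes the rule works exactly as described in this article. The amount shared between Kit A and Kit B is only described as "20 cM and over," so the 25 cM used below is a placeholder, and the names `shared_cm`, `cm` and `shared_matches` are invented for this illustration.

```python
THRESHOLD_CM = 20  # the cutoff described in this article

kits = ["You", "A", "B", "C"]

# Pairwise shared cM from the example. B and C do not match, so they have no entry.
shared_cm = {
    frozenset(["You", "A"]): 10,
    frozenset(["You", "B"]): 20,
    frozenset(["You", "C"]): 40,
    frozenset(["A", "B"]): 25,  # placeholder: the article only says 20 cM or over
    frozenset(["A", "C"]): 9,
}


def cm(kit1, kit2):
    """Shared cM between two kits, or 0 if they do not match."""
    return shared_cm.get(frozenset([kit1, kit2]), 0)


def shared_matches(tester, match):
    """Third parties the tester sees when viewing shared matches with one match."""
    if cm(tester, match) == 0:
        return []  # no match, so there is no shared match list to request
    return [
        third
        for third in kits
        if third not in (tester, match)
        and cm(match, third) > 0               # third person matches the selected match
        and cm(tester, third) >= THRESHOLD_CM  # and matches the tester at 20 cM or more
    ]


for tester in kits:
    for match in kits:
        if match != tester and cm(tester, match) > 0:
            print(f"{tester} viewing shared matches with {match}: {shared_matches(tester, match)}")
```

Under these assumptions, the output agrees with the descriptions above. You see B and C when viewing Kit A, but never see A when viewing B or C. Kit C sees you when viewing Kit A, but does not see Kit A when viewing you.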

Whew, that was a mouthful.

You may need to refer back to this from time to time if you’re confused by your shared matches at Ancestry.

If you need to remember rules, remember this.

  1. You can obtain shared matches with yourself plus any match, regardless of how much or how little DNA you share with that one match. Prove this to yourself by finding a match under 20 cM, like my 8 cM match, and viewing your shared matches.
  2. No one will show on a shared match list with another person unless they match you at 20 cM or greater. Prove this to yourself by viewing the smallest shared match with anyone.

Strategy

The takeaway is this: if you have a larger match (20 cM or over) and a smaller match (under 20 cM), always request shared matches from the perspective of the smaller match, because the smaller match won’t show up as a shared match on any shared match list.

The only way you can see shared matches that include people under 20 cM is to request shared matches with the individual people who match you below 20 cM.

In my case, I will never see KO on any shared match list because I only match KO at 8 cM. However, I can request my shared matches with KO in which case I’ll see all 20 cM or greater shared matches with KO.

Alternatives

Every vendor provides a shared match feature, and each functions differently.

In the chart below, I’ve provided basic shared match information for each vendor.

If you’re interested in uploading your DNA file from Ancestry or another vendor, I’ve provided upload/download step-by-step instructions for each vendor, here.

_____________________________________________________________

Follow DNAexplain on Facebook, here or follow me on Twitter, here.

Share the Love!

You’re always welcome to forward articles or links to friends and share on social media.

If you haven’t already subscribed (it’s free,) you can receive an email whenever I publish by clicking the “follow” button on the main blog page, here.

You Can Help Keep This Blog Free

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Uploads

Genealogy Products and Services

My Book

Genealogy Books

Genealogy Research

Million Mito Project Team – Introduction and Progress Update

Let me introduce you to the Million Mito Project team.

Left to right, Goran Runfeldt, Dr. Paul Maier, me, and Dr. Miguel Vilar. And yes, I know we look kind of like a band😊. The Merry Mito Band maybe, except, trust me, I can’t sing.

Yes, we finally, finally got to meet in person recently, and let me tell you, that was one joyful meeting. I hadn’t realized that while I know everyone, not everyone else had met in person before.

We have been working for almost two years together via Zoom, but separately. Just 10 days after the Million Mito Project was announced, we went into Covid lockdown.

It’s difficult to work remotely on such a huge collaborative project, but we have been making inroads, albeit slower than we had initially hoped.

Complicating this was the merger of FamilyTreeDNA with myDNA in January of 2021, with Bennett Greenspan stepping down as the CEO in that process. Bennett greenlit the Million Mito project initially. (Thank you, Bennett!)

Thankfully, the new CEO, Dr. Lior Rauschberger, continued that greenlight without hesitation as soon as our team was able to inform him about this wonderful scientific project that was underway. (Thank you, Lior!)

I can’t tell you what a HUGE relief that was.

While all change is challenging, and complicated by the Covid landscape, life events, and geographic distance, that merger really was the right decision. Lior is committed to scientific research, discovery, and the genealogy marketspace. He’s looking to expand, not contract.

You’re probably wondering where we are now in the Million Mito process.

Million Mito Project Update

I’d like to provide a brief update.

  • We have an academic paper in the final stages of the submission process, but this paper is not the final tree. It is, however, something extremely cool and important to the history of womankind! I can’t say more until publication, but I’ll write an article when the paper is published.
  • The team hopes to work with a million samples between all sources including FamilyTreeDNA testers, research-consented Genographic samples, GenBank, and other academic samples. Not all samples from those sources are full mitochondrial sequences, or necessarily pass our QC checks.

If you haven’t yet taken a full sequence test, you can help reach the one million goal by ordering a mitochondrial DNA test at FamilyTreeDNA, here. If you tested at a lower level some years back, please sign on to your account and upgrade so you can be a part of this scientific frontier.

  • We discovered that the authors of Phylotree never documented the “recipe” for reconstructing the tree behind the scenes, so we can’t exactly use the recipe for Phylotree as the basis for constructing a future tree.
  • We have been in the process of writing phylogenetic software that arrives at a similar tree to use as a baseline reference structure in order to preserve as many of the current Phylotree haplogroup names as possible.

Hand curation and placement is possible for hundreds or a few thousand samples, but it’s not possible for large numbers. While phylogenetic software to do this kind of work has existed for a long time, it typically can’t handle huge trees like what we are building.

Phylogenetic methods also struggle with highly recurrent mutations, and rapid star-burst expansions that we see on the human trees. A phylogenetic problem of this magnitude requires lots of innovations to correctly interpret lineage history from complex mutations.

Automated software to handle very large numbers of sequences must be adapted or developed.

  • Furthermore, simply building upon an existing scaffold without automating the process does not provide an ongoing, sustainable procedure to discover where new dividing branches occur internally within the tree, versus at the tips. In other words, adding new branches based on common mutations is only easy when you’re simply appending a new haplogroup to an existing one.

For example, I might have a new haplogroup J1c2f1 derived from J1c2f. That’s easy. It’s another matter entirely if haplogroup J1 itself, high up in the tree, were broken into multiple new branches. Only automated software can “reconstruct” the tree regularly to discover new major branches as the results of more testers become available.

Challenges

Let me share some examples of the kinds of challenges that we’ve encountered. Not only are these interesting, but they are also educational.

These figures are from Paul Maier’s RootsTech presentation, which I strongly recommend that you view, here.

Mitochondrial DNA is both fascinating and habit-forming. The more you know, the more you want to know.

Let’s start with the basics. Haplogroups are defined by one or more mutations that everyone upstream does NOT have, and everyone downstream DOES have.

Pretty simple so far, right?

Haplogroup-Defining Mutations

Here’s an example of a nice simple mutation that is one of the multiple mutations that define haplogroup L1, near the base of the mitochondrial tree (Mitochondrial Eve) in the center. At location 3666, the “normal” value is G, but in this branch, the G in that position has been replaced by an A.

You can see that the other haplogroups shown in the circle by black dots don’t have the G-to-A mutation at location 3666, but the red dot locations do carry that mutation. Therefore, G3666A is one of the mutations that defines haplogroup L1. Haplogroups can be defined by only one unique mutation, or multiple mutations.
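For readers who like to see notation taken apart, here's a minimal Python sketch of how a label like G3666A can be read as "G at location 3666 replaced by A," and how you might check whether a tester carries it. The function names and the two sample testers are invented for this illustration and are not part of any FamilyTreeDNA tool.

```python
import re


def parse_substitution(label):
    """Split a label such as 'G3666A' into (original base, location, new base)."""
    found = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if found is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    return found.group(1), int(found.group(2)), found.group(3)


def carries_mutation(sample_bases, label):
    """sample_bases maps a location number to the base observed in one tester."""
    _original, location, new_base = parse_substitution(label)
    return sample_bases.get(location) == new_base


# Two made-up testers, recording only the base found at location 3666.
red_dot_tester = {3666: "A"}    # carries the G-to-A mutation
black_dot_tester = {3666: "G"}  # still has the "normal" value

print(parse_substitution("G3666A"))
print(carries_mutation(red_dot_tester, "G3666A"))
print(carries_mutation(black_dot_tester, "G3666A"))
```

A real haplogroup assignment looks at every defining mutation on the branch, not just one location, so treat this as a reading aid for the notation only.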

Multiple Haplogroup-Defining Mutations

Haplogroups with multiple mutations that define that specific haplogroup are candidates to be split into multiple branches, forming new haplogroups at some point in the future, when other people test who have:

  1. One or the other of those mutations, if there are only two
  2. A subset of the mutations, but not all of them

For example, in the view of the public mitochondrial haplotree at FamilyTreeDNA which you can view here, you see that haplogroup L1 is defined by a total of 6 mutations. Someday, people may test who have only half (or a portion) of those mutations, which would cause haplogroup L1 to split or branch into two separate haplogroups.

Unstable Mutations

Some mitochondrial locations are unstable, such as 16519C, along with a few other hypervariable locations. By unstable, I mean that they have mutated back and forth in the tree many times. The historical branching patterns of such unstable mutations can be difficult to decipher (the technical term is “saturation”), suggesting perhaps that they should not be the foundation for a new haplogroup.

Do we ignore those unstable locations entirely?

After discounting those well-known unstable locations, we still find some mutations, often in the HVR (hypervariable) regions, that occur close to 100 times in the full tree.

This mutation at location 150 from C to T occurred four distinct times just in this small subset of haplogroup L. You can see the 4 locations I’ve bracketed with red boxes.

Is C150T stable enough to form a haplogroup? Multiple haplogroups? Should it be used high in the tree if this affects the complete downstream structure?

This same mutation occurs additional times further downstream in the tree, as well.

Reverse Mutations

Of course, some haplogroups are defined by reverse mutations, where the original mutation reverts back to its original state.

What about locations that have as many as 3 reverse mutations, which means that one location mutates back and forth 6 times in total? Kind of like a drunken sailor zigging and zagging along the street.

If we counted each mutation and reversal as a new haplogroup, we would have 6 new haplogroups based on this one single location in one parent haplogroup. Is that accurate, or should we ignore it altogether?

Here’s an example of one mutation and a corresponding back mutation.

In this scenario, the mutation of location 7055 from A to G occurred once in the formation of haplogroup L1. However, a back mutation took place, signified by the ! (exclamation mark) after the A, which is a defining mutation for haplogroup L1c3. All of the other L1c haplogroups still carry the A to G mutation, while L1c3 does not.
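Here's a minimal Python sketch of that scenario, following location 7055 down two branches. The branch lists are simplified to this one location, the function name `apply_path` is invented for the example, and the labels A7055G and G7055A! are written the way the notation is described above, with the exclamation mark marking the back mutation.

```python
import re

# Original base, location, new base, and an optional "!" for a back mutation.
LABEL = re.compile(r"([ACGT])(\d+)([ACGT])(!?)")


def apply_path(location, start_base, labels):
    """Follow mutation labels from the root down one branch.

    Returns (final base, number of changes) at the given location.
    """
    base = start_base
    changes = 0
    for label in labels:
        found = LABEL.fullmatch(label)
        if found is None:
            raise ValueError(f"unrecognized label: {label!r}")
        old, position, new, _back_flag = found.groups()
        if int(position) != location:
            continue  # a mutation at some other location
        if old != base:
            raise ValueError(f"{label} expects {old} but the branch has {base}")
        base = new
        changes += 1
    return base, changes


# Simplified branches: only the mutations at location 7055 are listed.
other_l1c_branch = ["A7055G"]         # mutation formed with L1 and kept
l1c3_branch = ["A7055G", "G7055A!"]   # same mutation, then the back mutation

print(apply_path(7055, "A", other_l1c_branch))
print(apply_path(7055, "A", l1c3_branch))
```

The L1c3 branch ends up with the same A it started with, even though two changes happened along the way. That is exactly why a raw sequence alone can't tell you whether a location never mutated or mutated and then reverted.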

In some scenarios, the same location bounces back and forth. Should it still be counted as a haplogroup defining mutation, or is it simply “noise”?

Heteroplasmies

How do heteroplasmies play into this scenario?

Heteroplasmies occur when more than one value is discerned in an individual’s DNA at a specific location. Heteroplasmies do not define haplogroups, but they are reported in your personal results.

To be reported as a heteroplasmy, both values need to be detected at a level of over 20%. In the above scenario, if both G and A were found greater than 20% of the time, it would be counted as a heteroplasmy with a special notation.

For example, if G and A are both found more than 20% of the time, the notation would be R instead of either G or A. If the location was G7055, above, and G and A were both found above 20%, the notation would be G7055R.

However, if G was found 81% of the time or more, then it would be counted as G, which is “normal,” and if A was found 81% of the time or more, then the value would be reported as A, a mutation. If we see the normal state of G, then an A, then a G, is that a mutation and a back mutation? How many samples would need to contain that back mutation to count it as a mutation and not an aberration, an undetected borderline heteroplasmy slipping back and forth over the threshold, or simply noise?
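The reporting rule can be sketched in Python as follows. This assumes the simple "both values over 20%" rule described above and uses the standard IUPAC two-base codes, where R stands for A or G. The function name `call_base` and the read percentages are made up for illustration, and this is not FamilyTreeDNA's actual pipeline.

```python
# Standard IUPAC codes for a mix of two bases.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}

HETEROPLASMY_MIN_FRACTION = 0.20  # both values must be found more than 20% of the time


def call_base(read_fractions):
    """read_fractions maps each base to the fraction of reads showing it (0 to 1)."""
    present = [
        base
        for base, fraction in read_fractions.items()
        if fraction > HETEROPLASMY_MIN_FRACTION
    ]
    if len(present) == 1:
        return present[0]  # one clear value, reported as-is
    if len(present) == 2:
        return IUPAC_TWO_BASE[frozenset(present)]  # heteroplasmy code
    raise ValueError("this sketch only handles one or two values above the threshold")


# Hypothetical read fractions at location 7055.
print("G7055" + call_base({"G": 0.60, "A": 0.40}))  # both above 20%
print("G7055" + call_base({"G": 0.05, "A": 0.95}))  # A clearly dominates
print(call_base({"G": 0.95, "A": 0.05}))            # the "normal" value
```

A sample sitting right at the cutoff could flip between a heteroplasmy and a single value from one test to the next, which is the borderline problem raised above.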

Transitions Versus Transversions

There are two types of mutations, transitions and transversions, that probably should be weighted differently – but how differently, and why?

Some types of mutations occur more easily than others and are therefore more common. Paul explains this very well in his RootsTech video, but in a nutshell, transitions between T/C and A/G are much more common than transversions between A/C, G/T, C/G, and A/T. To set them apart, transversions are noted with a lowercase letter, shown above as T7624a.

In phylogenetics, the rarer mutation which is chemically less likely to occur (transversion) is weighted more heavily than the likelier mutations (transitions).
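To make the distinction concrete, here's a short Python sketch that classifies a change as a transition or a transversion and writes the label with the lowercase convention. The function names are invented for this illustration. I've deliberately left out any numeric weights, because how heavily to weight each type is one of the open questions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def mutation_type(original, new):
    """Classify a single-base change as a transition or a transversion."""
    pair = {original.upper(), new.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # A/G and C/T swaps stay within the same chemical class: transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def format_label(original, location, new):
    """Write the label, using a lowercase final letter for transversions."""
    if mutation_type(original, new) == "transversion":
        return f"{original.upper()}{location}{new.lower()}"
    return f"{original.upper()}{location}{new.upper()}"


print(mutation_type("G", "A"), format_label("G", 3666, "A"))
print(mutation_type("T", "A"), format_label("T", 7624, "A"))
```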

Insertions

Insertions are another type of challenge. Insertions happen when extra DNA is inserted at a specific location, kind of like the genetic equivalent of cutting in line.

In this graphic, we see that at location 5899, there’s an extension of .XC, written as 5899.XC. This means that at this location, you’ll find an unknown or varying number of additional Cs inserted. Paul showed several example sequences in the box at upper left. In some people who have this mutation, there are only one or two inserted Cs. In other people, there are several Cs, shown in the bottom two sequences.

You might recognize this as a phenomenon similar to Y DNA STRs which are short tandem repeats. Of course, we don’t use STRs for haplogroup identification in Y DNA. How should we handle insertions, especially multiple insertions, in building the Mitotree?

Deletions

We see deletions of DNA too, indicated by a small “d” after the location. In some cases, we find large deletions.

At location 8281, there is a 9 base-pair deletion (8281 through 8289) that is one of the haplogroup defining mutations for haplogroup L0a2. We find a 9 base-pair deletion in exactly the same location again within subclades of haplogroups B and U.

Is there something about this specific location that makes it more prone to deletions, and specifically a deletion of exactly 9 base pairs?

Seeking Answers

Of course, we’re seeking all of these answers.

The team has been writing code to create structural trees based on various scenarios and trying to determine which ones make the most sense, all factors considered.

The current official tree, meaning the 2016 Build 17 version of Phylotree, is based on about 8,000 samples. Working with one million versus 8,000 is a challenge that ramps exponentially, necessitating substantial computing power.

Working with 125 times more data provides amazing potential, but it has also introduced challenges that never had to be addressed before. It’s evident, to us at least, why Phylotree wasn’t updated after 2016. The tools simply don’t exist.

Sneak Peek

We fully expect hundreds if not thousands of new haplogroups to form. Today, Paul’s haplogroup is U5a2b2a which was formed about 5,000 years ago during the Bronze Age.

The haplogroup itself is useful to determine roughly where your ancestors were at that time, and often provides information about more recent population group history, but you need mitochondrial DNA matching to provide more genealogically useful information.

Paul’s test results show that he has 8 extra mutations, which means those mutations are in addition to his haplogroup-defining mutations. These extra mutations are what make genealogical matching so useful.

Paul has 16 full sequence matches that match him at a genetic distance of 3 mutations or fewer, although due to privacy restrictions at FamilyTreeDNA, we can’t see which matches share which mutations.

Given that Paul has 8 extra mutations, this means that it’s possible that one or more new haplogroups will be formed using some or all of those 8 extra mutations, and that those people who match him at a GD of 3 or less will very likely be members of a newly formed haplogroup.

Here’s a comparison of Paul’s haplogroup today, at left, with the newly created U5a2b2a branch and resulting subclades in a beta version of our experimental Mitotree, at right. This moves Paul’s new haplogroup, the pink node at right, from 5,000 to 500 years ago which is clearly within a genealogically relevant timeframe.

The single haplogroup, U5a2b2a, now has been expanded to 7 subgroups. If U5a2b2a is representative of the expansion capability of the entire tree, that’s a 7-fold increase.

Of Paul’s 16 matches, those with the same new haplogroup are those where he needs to focus his genealogical research.

Where Are We?

This is not a commitment, but we expect to release a sneak preview of the new Mitotree this year.

If you have extra or missing mutations, especially in the coding region, you and your close matches may very well receive a new, expanded haplogroup.

Highly refined haplogroups will improve the ability to use mitochondrial DNA for genealogical purposes – similar to what the Big Y-700 SNP testing and the expanded haplotree have done for Y DNA.

Like with Y DNA, you’ll want to use your new haplogroup in combination with genealogical trees.

The more people that test, the more success stories emerge, and the more people that WILL test. Just think what would happen if everyone who took a Y or autosomal DNA test also took a mitochondrial DNA test. We’d be bulldozing through brick walls every day.

I don’t know about you, but I have so many women in my trees with no parents. I need more tools and can hardly wait.

Resources

The new Mitotree is fueled by the Million Mito Project which is fueled by full sequence DNA testing, so please purchase yours today.

And yes, in case you were wondering, the new Mitotree will be free and public, just like the existing Mitochondrial DNA Tree and Y DNA Tree are at FamilyTreeDNA today.

You can read more about the Million Mito project here and here.

You can watch Paul’s Million Mito RootsTech presentation, here.

Paul, Miguel and I will be co-presenting Mitochondrial DNA Academy on Saturday, April 23, during the ECCGC Conference which you can read about here and register here.
