Concepts – The Faces of Endogamy

Recently, while checking Facebook, I saw this posting from my friend who researches in the same Native admixed group of families in North Carolina and Virginia that I do. Researchers have been trying for years to sort through these interrelated families. As I read Justin’s post, I realized that this is a great example of endogamy and of how it often presents itself to genealogists.

I match a lot of people from the Indian Woods [Bertie County, NC] area via DNA, with names like Bunch, Butler, Mitchell, Bazemore, Castellow, and, of course, Collins. While it’s hard to narrow in on which family these matching segments come from, I can find ‘neighborhoods’ that fit the bill genetically. This [census entry] is from near Quitsna in 1860. You see Bunch, Collins, Castellow, Carter, and Mitchell in neighboring households.

Which raises the question: what is endogamy, do you have it, and how can you tell?

Definition

Endogamy is the practice or custom of marrying within a specific group, population, geography or tribe.

Examples that come to mind are Ashkenazi Jews, Native Americans (before European and African admixture), Amish, Acadians and Mennonite communities.

Some groups marry within their own ranks due to religious practices. Jewish, Amish and Mennonite would fall under this umbrella. Some intermarry due to cultural practices, such as Acadians, although their endogamy could also partly be attributed to their staunch Catholic beliefs in a primarily non-Catholic region. Some people practice endogamy due to lack of other eligible partners such as Native Americans before contact with Europeans and Africans.  People who live on  islands or in villages whose populations were restricted geographically are prime candidates for endogamy.

In the case of Justin’s group of families who were probably admixed with Native, European and African ancestors, they intermarried because there were socially no other reasonable local options. In Virginia during that timeframe, mixed race marriages were illegal. Not only that, but you married who lived close by and who you knew – in essence the neighbors who were also your relatives.

Endogamy and Genetic Genealogy

In some cases, endogamy is good news for the genealogist. For example, if you’re working with Acadian records, know which Catholic church your ancestors attended, and those church records still exist, you’re practically guaranteed to find the entire family, because Acadians nearly always married within the Acadian community, and the entire Acadian community was Catholic. Catholics kept wonderful records. Even when the Acadians married a Native person, the Native spouse is almost always baptized and recorded with a non-Native name in the Catholic church records, which paved the way for a Catholic marriage.

In other cases, such as Justin’s admixed group, the Brethren who notoriously kept no church records or the Jewish people whose records were largely destroyed during the Holocaust, endogamy has the opposite effect – meaning that actual records are often beyond the reach of genealogists – but the DNA is not.

It’s in cases like this that people reach for DNA to help them find their families and connections.

What Does Endogamy Look Like?

If you know nothing about your heritage, how would you know whether you are endogamous or not? What does it look like? How do you recognize it?

The answer is…it depends. Unfortunately, there’s no endogamy button that lights up on your DNA results, but there are a range of substantial clues.  Let’s divide up the question into pieces that make sense and look at a variety of useful tools.

Full or Part?

First of all, fully and partly endogamous ancestry, and endogamy from different sources, have different signs and symptoms, so to speak.

A fully endogamous person, depending on their endogamy group, may have either strikingly more autosomal DNA matches than average, or very few.

Another factor will be geography, where you live, which serves to rule out some groups entirely. If you live in Australia, your ancestors may be European but they aren’t going to be Native American.

How many people in your endogamous group have DNA tested is another factor that weighs very heavily in terms of what endogamy looks like, as is the age of the group. The older the group, generally the more descendants available to test, although that’s not always the case. For example, warfare, cultural genocide and disease wiped out many or most of the Native population in the United States, especially east of the Mississippi and particularly in the easternmost seaboard regions.

Because of the genocide perpetrated upon the Jewish people, followed by the scattering of survivors, Jewish descendants are inclined to test to find family connections. Jewish surnames may have been changed or not adopted in some cases until late, in the 1800s, and finding family after displacement was impossible in the 1940s for those who survived.

Let’s look at autosomal DNA matches for fully and partly endogamous individuals.

Jewish people, in particular Ashkenazi, generally have roughly three times as many matches as non-endogamous individuals.

Conversely, because very few Native people have tested, Native testers, especially non-admixed Native individuals, may have very few matches.

It’s ironic that my mother, the last person listed, with two endogamous lines, still has fewer matches than I do, the first person listed.  This is because my father has deep colonial roots with lots of descendants to test, and my mother has recent immigration in her family line – even though a quarter of her ancestry is endogamous.

To determine whether we are looking at endogamy, sometimes we need to look for other clues.

There are lots of ways to discover additional clues.

Surnames

Is there a trend among the surnames of your matches?

At the top of your Family Finder match page your three most common surnames are displayed.

A fully endogamous Jewish individual’s most common surnames are shown above. If you see Cohen among your most common surnames, you are probably Jewish, given that the Kohanim have special religious responsibilities within the Jewish faith.

Of course, especially with autosomal DNA, the person’s current surname may not be indicative, but there tends to be a discernable pattern with someone who is highly endogamous. When someone who is fully endogamous, such as the Jewish population, intermarries with other Jewish people, the surnames will likely still be recognizably Jewish.

Our Jewish individual’s first matching page, meaning his closest matches, includes the following surnames:

  • Cohen
  • Levi
  • Bernstein
  • Kohn
  • Goldstein

The Sioux individual only has 137 matches, but his first page of matches includes the following surnames:

  • Sunbear
  • Deer With Horns
  • Eagleman
  • Yelloweyes
  • Long Turkey
  • Fire
  • Bad Wound
  • Growing Thunder

These surnames are very suggestive of Native American ancestry in a tribe that did not adopt European surnames early in their history. In other words, not east of the Mississippi.

At Family Tree DNA, every person has the opportunity to list their family surnames and locations, so don’t just look at the tester’s surname, but at their family surnames and locations too. The Ancestral Surname column is located to the far right on the Family Finder matches page. If you can’t see all of the surnames, click on the person’s profile picture to see their entire profile and all of the surnames they have listed.
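If you have copied your matches' ancestral surname lists into a spreadsheet or text file, a surname trend can also be tallied by hand or with a few lines of code. The sketch below is only an illustration: the match labels and the way surnames are paired with them are invented, the function name is hypothetical, and this is not how Family Tree DNA computes the "most common surnames" display.

```python
from collections import Counter


def most_common_surnames(matches, top_n=3):
    """Tally ancestral surnames across matches, counting each surname
    at most once per match, and return the top_n most frequent."""
    counts = Counter()
    for match in matches:
        # Normalise case and spacing so "cohen " and "Cohen" count together.
        unique = {name.strip().title() for name in match["surnames"] if name.strip()}
        counts.update(unique)
    return counts.most_common(top_n)


# Invented example data: four matches and the surnames each one listed.
matches = [
    {"match": "Match A", "surnames": ["Cohen", "Levi", "Bernstein"]},
    {"match": "Match B", "surnames": ["cohen ", "Goldstein"]},
    {"match": "Match C", "surnames": ["Cohen", "Levi", "Kohn"]},
    {"match": "Match D", "surnames": ["Cohen", "Levi", "Bernstein"]},
]

top_three = most_common_surnames(matches)
for surname, count in top_three:
    print(f"{surname}: listed by {count} matches")
```

A tally like this only points you toward a pattern. The surnames still need to be checked against locations and trees before drawing any conclusion.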


If you haven’t listed your family surnames, now would be a good time. You can do this by clicking on the orange “Manage Personal Information” link near your profile picture on the left of your personal page.

The orange link takes you to the account settings page. Click on the Genealogy tab, then on surnames. Be sure to click the orange “save” when you are finished.

Partial Endogamy

Let’s take a look at a case study of someone who is partially endogamous, meaning that they have endogamous lines, but aren’t fully endogamous. My mother, who is the partially endogamous individual with 1231 matches, is a good example.

Mother is a conglomeration of immigrants. Her 8 great-grandparents break down as follows:

In mother’s case, a few different forces are working against each other. Let’s take a look.

The case of recent immigration from the Netherlands, in the 1850s, would serve to reduce mother’s matches because there has been little time in the US for descendants to accrue and test. Because people in the Netherlands tend to be very reluctant about DNA testing, very few have tested, also having the effect of reducing her number of matches.

Mother’s Dutch ancestors were Mennonites, an endogamous group within the Netherlands, which would further reduce her possibilities of having matches on these lines since she would be less likely to match the general population and more likely to match individuals within the endogamous group. If people from the Mennonite group tested, she would likely match many within that group. In other words, for her to find Dutch matches, people descended from the endogamous Dutch Mennonite population would need to test. At Family Tree DNA, there are both a Low Mennonite Y DNA project and an Anabaptist autosomal DNA project, but these groups tend to attract the Mennonites that migrated to Russia and Poland, not the group that stayed in the Netherlands. Another issue, at least in mother’s case, is that her Mennonite relatives “seem” to have been later converts, not part of the original Mennonite group – although it’s difficult to tell for sure in the records that exist.

Mother’s Kirsch and Drechsel ancestors were also recent immigrants in the 1850s, from Germany, with very few descendants in the US today. The villages from where her Kirsch ancestors immigrated, based on the church records, did tend to be rather endogamous. However, that endogamy would only have reached back about 200 years, as far as the 30 Years’ War when that region was almost entirely, if not entirely, depopulated. So while there was recent endogamy, there (probably) wasn’t deep endogamy. Of course, it would require someone from those villages to test so mother could have matches before endogamy can be relevant. DNA testing is not popular in Germany either.

Because of recent immigration, altogether one half of mother’s heritage would reduce her number of matches significantly. Recent immigrants simply have fewer descendants to test.

On the other hand, mother’s English line has been in the US for a long time, some since the Mayflower, so she could expect many matches from that line, although they are not endogamous. If you’re thinking to yourself that deep colonial ancestry can sometimes mimic endogamy in terms of lots of matches, you’re right – but still not nearly to the level of a fully endogamous Jewish person.

Mother’s Acadian line has been settled in North America in Nova Scotia since the early 1600s, marrying within their own community, mixing with the Native people and then scattering in different directions after 1755 when they were forcibly removed. Acadians, however, tended to remain in their cultural groups, even after relocation. Many Acadian descendants DNA test and all Acadians descend from a limited and relatively well documented original population. That level of documentation is very unusual for endogamous groups. Acadian surnames are well known and are French. The best Acadian genealogical resource is Karen Theriot’s comprehensive tree on Rootsweb in combination with the Mothers of Acadia DNA project at Family Tree DNA. I wish there was a similar Fathers of Acadia project.

Mother’s Brethren line is much less well documented due to a lack of church records. The Brethren community immigrated in the early 1700s from primarily Switzerland and Germany, was initially relatively small, lived in clusters in specific areas, traveled together and did not marry outside the Brethren faith. Therefore, Brethren heritage and names also tend to be rather specific, but not as recognizable as Acadian names. After all, the Brethren were German/Swiss and in mother’s case, she also has another 1/4th of her heritage that are recently immigrated Germans – so differentiating one German group from the other can be tricky. The only way to tell Brethren matches from other German matches is that the Brethren also tend to match each other.

In Common With

If you notice a group of similar appearing surnames, use the ICW (in common with) tool at Family Tree DNA to see who you match in common with those individuals. If you find that you match a whole group of people with similar surnames or geography, contact your matches and ask if they know any of the other matches and how they might be related. I always recommend beginning with your closest matches because your common ancestor is likely to be closer in time than people who match you more distantly.

In the ICW match example below, all of the matches who do show ancestral surnames include Acadian surnames and/or locations.

Acadians, of course, became Cajuns in Louisiana where one group settled after their displacement in Nova Scotia. The bolded surnames match surnames on the tester’s surname list.

The ICW tools work particularly well if you know of or can identify one person who matches you within a group, or simply on one side of your family.

Don Worth’s Autosomal DNA Segment Analyzer is an excellent tool to genetically group your matches by chromosome. It’s then easy to use the chromosome browser at Family Tree DNA to see which of these people match you on the same segments. These tools work wonderfully together.

The group above is an Acadian match group. By hovering over the match names, you can see their ancestral surnames which make the Acadian connection immediately evident.

The Matrix

In addition to seeing the people you match in common with your matches by utilizing the ICW tool at Family Tree DNA, you can also utilize the Matrix tool to see if your matches also match each other. This isn’t the same as triangulation, because it doesn’t tell you if they match each other on the same exact segment. Still, it’s a wonderful tool: when your matches don’t cooperate or communicate enough to determine triangulation between multiple people, the Matrix is a very good secondary approach and often predicts triangulation accurately.

In the Matrix, above, the blue boxes indicate that these individuals (from your match list) also match each other.
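For anyone who keeps their own notes on who matches whom, the same idea can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the Family Tree DNA Matrix itself: the match labels, the list of matching pairs and both function names are hypothetical. Like the Matrix, it only shows that people match each other, not that they share the same segment, so it does not prove triangulation.

```python
from itertools import combinations


def build_matrix(names, matching_pairs):
    """Return a dict of dicts where matrix[a][b] is True when a and b match."""
    pairs = {frozenset(pair) for pair in matching_pairs}
    return {
        a: {b: a != b and frozenset((a, b)) in pairs for b in names}
        for a in names
    }


def fully_connected_trios(names, matrix):
    """List every group of three in which all three people match one another."""
    return [
        trio
        for trio in combinations(names, 3)
        if all(matrix[a][b] for a, b in combinations(trio, 2))
    ]


# Invented example: four people from a match list and who matches whom.
names = ["Match A", "Match B", "Match C", "Match D"]
matching_pairs = [
    ("Match A", "Match B"),
    ("Match A", "Match C"),
    ("Match B", "Match C"),
    ("Match C", "Match D"),
]

matrix = build_matrix(names, matching_pairs)
trios = fully_connected_trios(names, matrix)

# Print a simple grid: X marks a pair that matches each other.
for a in names:
    print(a, " ".join("X" if matrix[a][b] else "." for b in names))
print("Everyone matches everyone in:", trios)
```

A group where everyone matches everyone else is a good candidate to follow up in the chromosome browser, where you can check whether the shared segments actually overlap.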

For additional information on various autosomal tools available for your use, click here to read the article, Nine Autosomal Tools at Family Tree DNA.

MyOrigins

Everyone who takes the Family Finder test also receives their ethnicity estimates on the MyOrigins tab.

In the case of our Jewish friend, above, his MyOrigins map clearly shows his endogamous heritage. He does have some Middle Eastern region admixture, but I’ve seen Ashkenazi Jewish results that are 100% Ashkenazi Jewish.

The same situation exists with our Sioux individual, above. Heavily Native, removing any doubt about his ancestry.

However, mother’s European admixture blends her MyOrigins results into a colorful but unhelpful European map, at least in terms of determining whether she is endogamous or has endogamous lines.

European endogamous admixture, except for Jewish heritage, tends to not be remarkable enough to stand out as anything except European heritage utilizing ethnicity tools. In addition, keep in mind that DNA testing in France for genealogy is illegal, so often there is a distinct absence in that region that is a function of the lack of testing candidates. Acadians may not show up as French.

Ethnicity testing tends to be excellent at determining majority ethnicity, and determining differences between continental level ethnicity, but less helpful otherwise. In terms of endogamy, Jewish and Native American tend to be the two largest endogamous groups that are revealed by ethnicity testing – and for that purpose, ethnicity testing is wonderful.

Y and Mitochondrial DNA and Endogamy

Autosomal tools aren’t the only tools available to the genetic genealogist. In fact, if someone is 100% endogamous, or even half endogamous, chances are very good that either the Y DNA for males on the direct paternal line, or the mitochondrial DNA for males and females on the direct matrilineal line will be very informative.

On the pedigree chart above, the blue squares represent the Y DNA that the father contributes to only his sons and the red circles represent the mitochondrial DNA (mtDNA) that mothers contribute to both genders of their children, but is only passed on by the females.

By utilizing Y and mtDNA testing, you can obtain a direct periscope view back in time many generations, because the Y and mitochondrial DNA is preserved intact, except for an occasional mutation. Unlike autosomal DNA, the DNA of the other parent is not admixed with the Y or mitochondrial DNA. Therefore, the DNA that you’re looking at is the DNA of your ancestors, generations back in time, as opposed to autosomal DNA which can only reliably reach back 5 or 6 generations in terms of ethnicity because it gets halved in every generation and mixed with the DNA of the other parent.
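The halving described above is easy to put into numbers. The sketch below simply computes one half raised to the number of generations, which is the average share of autosomal DNA expected from any single ancestor at that distance. These are averages only; because of the way DNA recombines, the amount actually inherited from a given ancestor varies from person to person. The function name is just an illustrative choice.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor who is
    `generations_back` generations above the tester
    (1 = parent, 2 = grandparent, 3 = great-grandparent, and so on)."""
    if generations_back < 0:
        raise ValueError("generations_back must be non-negative")
    return 0.5 ** generations_back


for generation in range(1, 8):
    ancestors = 2 ** generation  # number of ancestors in that generation
    share = expected_share(generation) * 100
    print(f"generation {generation}: {ancestors:>3} ancestors, "
          f"about {share:.2f}% from each on average")
```

By the sixth generation the average share from one ancestor is already under 2%, which is why autosomal ethnicity results become unreliable at about that distance, while Y and mtDNA are passed down essentially unchanged.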

With autosomal DNA, we can see THAT it exists, but not who it came from. With Y and mtDNA, we know exactly who in your tree that specific DNA came from.

We do depend on occasional Y and mtDNA mutations to allow our lines to accrue enough mutations to differentiate us from others who aren’t related, but those mutations accrue very slowly over hundreds to thousands of years.

Our “clans,” over time, are defined by haplogroups and both our individual matches and our haplogroup or clan designation can be very useful. Your haplogroup will indicate whether you are European, Jewish, Asian, Native American or African on the Y and/or mtDNA line.

In cases of endogamous groups where the members are known to marry only within the group, Y and mtDNA can be especially helpful in identifying potential families of origin. This is evident in the Mothers of Acadia DNA project as well as a particular brick wall I’m working on in mother’s Brethren line. Success, of course, hinges on members of that population testing their Y or mtDNA and being available for comparison.

Always test your Y (males only) and mitochondrial DNA (males and females.) You don’t know what you don’t know, and sometimes those lines may just hold the key you’re looking for. It would be a shame to neglect the test with the answer, or at least a reasonably good hint! Stories of people discovering their ethnic heritage, at least for that line, by taking a Y or mtDNA test are legendary.

Jewish Y and Mitochondrial DNA

Fortunately, for genetic genealogists, Jewish people carry specific sub-haplogroups that are readily identified as Jewish, although carrying these subgroups doesn’t always mean you’re Jewish. “Jewish” is a religion as well as a culture that has been in existence as an endogamous group long enough in isolation in the diaspora areas to develop specific mutations that identify group members. Furthermore, the Jewish people originated in the Near East and are therefore relatively easy, in terms of Y and mtDNA, to differentiate from the people native to the regions outside of the Near East where groups of Jewish people settled.

The first place to look for hints of your heritage is your main page at Family Tree DNA. First, note your haplogroups and any badges you may have in the upper right hand corner of your results page.

In this man’s case, the Cohen badge is his first clue that he matches or closely matches the known DNA signature for Jewish Cohen men.

Both Y DNA and mitochondrial DNA results have multiple tabs that hold important information.

Two tabs, Haplogroup Origins and Ancestral Origins are especially important for participants to review.

The Haplogroup Origins tab shows a combination of academic research results identifying your haplogroup with locations, as well as some Ancestral Origins mixed in.

A Jewish Y DNA Haplogroup Origins page is shown above.

The Ancestral Origins page, below, reflects the location where your matches SAY their most distant direct matrilineal (for mtDNA) or patrilineal (for Y DNA) ancestors were found. Clearly, this information can be open to incorrect interpretation, and sometimes is. For example, people often don’t understand that “most distant maternal ancestor” means the direct line female on your mother’s mother’s mother’s side.  However, you’re not looking at any one entry. You are looking instead for trends.

The Ancestral Origins page for a Jewish man’s Y DNA is shown above.

The Haplogroup Origins page for Jewish mitochondrial DNA, below, looks much the same, with lots of Ashkenazi entries.

The mitochondrial Ancestral Origins results, below, generally become more granular and specific with the higher test levels. That’s because the more general results get weeded out at higher levels. Your closest matches at the highest level of testing are the most relevant to you, although sometimes people who tested at lower levels would be relevant, if they upgraded their tests.

Native American Y and Mitochondrial DNA

Native Americans, like Jewish people, are very fortunate in that they carry very specific sub-haplogroups for Y and mitochondrial DNA. The Native people had a very limited number of founders in the Americas when they originally arrived, between roughly 10,000 and 25,000 years ago, depending on which model you prefer to use. Descendants had no choice but to intermarry with each other for thousands of years before European and African contact brought new genes to the Native people.

Fortunately, because Y and mtDNA don’t mix with the other parent’s DNA, no matter how admixed the individual today, testers’ Y and mtDNA still show exactly the origins of that lineage.

Native American Y DNA shows up as such on the Haplogroup Origins and Ancestral Origins tabs, as illustrated below.

The haplogroup assigned is shown along with a designation as Native on the Haplogroup Origins and Ancestral Origins pages. The haplogroup is assigned through DNA testing, but the Native designation and location is entered by the tester. Do be aware that some people record the fact that their “mother’s side” or “father’s side” is reported to have a Native ancestor, which is not (necessarily) the same as the matrilineal or patrilineal line. Their “mother’s side” and “father’s side” can have any number of both male and female ancestors.

If the tester’s haplogroup comes back as non-Native, the erroneous Native designation shows up in their matches’ Ancestral Origins page as “Native,” because that is what the tester initially entered. I wrote about this situation here, but there isn’t much that can be done about this unless the tester either realizes their error or thinks to go back and change their designation from Native American when they realize the DNA does not support the family story, at least not on this particular line. Erroneous labeling applies to both Y and mtDNA.

Native Y DNA falls within a subset of haplogroups C and Q. However, most subgroups of C and Q are NOT Native, but are European or Asian or in one case, a subgroup of haplogroup Q is Jewish. This does NOT mean that the Jewish people and the Native people are related within many thousands of years. It means they had a common ancestor in Asia thousands of years ago that gave birth to both groups. In essence, one group of the original Q moved east and eventually into the Americas, and one moved west, winding up in Europe. Today, mutations (SNPs) have accrued to each group that very successfully differentiate them from one another. In order to determine whether your branch of C or Q is Native, you must take additional SNP tests which further identify your haplogroup – meaning which branch of haplogroup C or Q you belong to.

Native American Y-DNA, to date, must fall into a subset of haplogroup C-P39, a subgroup of C-M217 or Q-M3, Q-M971/Z780 or possibly Q-B143 (ancient Saqqaq in Greenland), according to The study of human Y chromosome variation through ancient DNA. Each of these branches also has sub-branches except for Q-B143 which may be extinct. This isn’t to say additional haplogroups or sub-haplogroups won’t be discovered in the future. In fact, haplogroup O is a very good candidate, but enough evidence doesn’t yet exist today to definitively state that haplogroup O is also Native.

STR marker testing, meaning panels of markers from 12-111, provides all participants with a major haplogroup estimate, such as C or Q. However, to confirm the Y DNA haplogroup subgroup further down the tree, one must take additional SNP testing. I wrote an article about the differences between STR markers and SNPs, if you’d like to read it, here and why you might want to SNP test, here.

Testers can purchase individual SNPs, such as the proven Native SNPs, which will prove or disprove Native ancestry, a panel of SNPs which have been combined to be cost efficient (for most haplogroups), or the Big Y test which scans the entire Y chromosome and provides additional matching.

When financially possible, the Big Y is always recommended. The Big Y results for the Sioux man showed 61 previously unknown SNPs. The Big Y test is a test of discovery, and is how we learn about new branches of the Y haplotree. You can see the most current version of the haplogroup C and Q trees on your Family Tree DNA results page or on the ISOGG tree.

Native mitochondrial DNA can be determined by full sequence testing the mitochondrial DNA. The mtPlus test only tests a smaller subset of the mtDNA and assigns a base haplogroup such as A. To confirm Native ancestry, one needs to take the full sequence mitochondrial test to obtain their full haplogroup designation which can only be determined by testing the full mitochondrial sequence.

Native mitochondrial haplogroups fall into base haplogroups A, B, C, D, X and M, with F as a possibility. The most recent paper on Native Mitochondrial DNA Discoveries can be found here and a site containing all known Native American mitochondrial DNA haplogroups is here.

Not Native or Jewish

Unfortunately, other endogamous groups aren’t as fortunate as Jewish and Native people, because they don’t have haplogroups or subgroups associated with their endogamy group. However, that doesn’t mean there aren’t a few other tools that can be useful.

Don’t forget about your Matches Maps. While your haplogroup may not be specific enough to identify your heritage, your matches may hold clues. Each individual tester is encouraged to enter the identity of their most distant ancestor in both their Y (if male) and mtDNA lines. Additionally, on the bottom of the Matches Map, testers can enter the location where that most distant ancestor is found. If you haven’t done that yet, this is a good time to do that too!

When looking at your Matches Map, clusters and distribution of your matches’ most distant ancestor locations are important.

This person’s matches, above, suggest that they might look at the history of Nova Scotia and French immigrants. The history of Nova Scotia is synonymous with the Acadians, but the waterway distribution can also signal ancestry that is French but not Acadian. Native people are also associated with Nova Scotia and river travel. The person’s haplogroup would add to this story and focus on or eliminate some options.

This second example, above, suggests the person look to the history of Norway and Sweden, although their ancestor, indicated by the white balloon, is from Germany. If the tester’s genealogy is stuck in the US, this grouping could be a significant clue relative to either recent or deeper history. Do they live in a region where Scandinavian people settled? What history connects the region where the ancestor is found with Scandinavia?

This third example, above, strongly suggests Acadian, given the matches restricted to Nova Scotia, and, as it turns out, this individual does have strong Acadian heritage. Again, their haplogroup is additionally informative and points directly to the European or Native side of the Acadian heritage for this particular line.

In Summary

Sometimes endogamy is up front and in your face, evident from the minute your DNA results are returned. Other times, endogamous lines in ethnically mixed individuals reveal themselves more subtly, like with my friend Justin. Fortunately, the different types of DNA tests and the different tools at our disposal each contain the potential for a different puzzle piece to be revealed. Many times, our DNA results need to be interpreted with some amount of historical context to reveal the story of our ancestors.

When I first discovered that my mother’s line was Acadian, my newly found cousin said to me, “If you’re related to one Acadian, you’re related to all Acadians.” He wasn’t kidding. For that very reason, endogamous genetic genealogy is tricky at best and frustrating at worst.

When possible, Y and mtDNA is the most definitive answer, because the centuries or millennia of intermarriage don’t affect Y and mtDNA. If you are Jewish or Native on the appropriate lines for testing, Y and mtDNA is very definitive. If you’re not Jewish or Native on your Y or mtDNA lines, check your matches for clues, including surnames, Haplogroup and Ancestral Origins, and your Matches Map.

Consider building a DNA pedigree chart that documents each of your ancestors’ Y and mtDNA for lines that aren’t revealed in your own test. The story of Y and mtDNA is not confused or watered down by admixture and is one of the most powerful, and overlooked, tools in the genealogist’s toolbox.

Autosomal DNA when dealing with endogamy can be quite challenging, even when working with well-documented Acadian genealogy – because you truly are related to everyone.  Trying to figure out which DNA segments go with, or descend from, which ancestors reaching back several generations is the ultimate jigsaw puzzle. Often, I work with a specific segment and see how far back I can track that segment in the ancestral line of me and my matches. On good days, we arrive at one common ancestor. On other days, we arrive at dead ends that are not a common ancestor – which means of course that we keep searching genealogically – or pick a different segment to work with.

When working with autosomal DNA of endogamous individuals (or endogamous lines of partially endogamous individuals), I generally use a larger matching threshold than with non-endogamous individuals, because we already know that these people will have segments that match because they descend from the same populations. In general, I ignore anything below 10cM and often below 15cM if I’m looking for a genealogical connection in the past few generations. If I’m simply mapping DNA to ancestors, then I use the smaller segments, down to either 7 or 5cM. If you want to read more about segments that are identical by chance (also known as false matches), identical by population and identical by descent (genealogically relevant matches), click here.
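If you download your matching segments into a spreadsheet or file, the thresholds described above are simple to apply. The sketch below assumes each segment has been reduced to a match label, a chromosome number and a size in cM. The segment data is invented for illustration, and the constant and function names are hypothetical rather than part of any vendor's tools.

```python
# Thresholds taken from the text above, in centimorgans (cM).
GENEALOGY_MIN_CM = 10.0   # looking for a connection in the past few generations
STRICT_MIN_CM = 15.0      # the stricter floor often used with endogamous lines
MAPPING_MIN_CM = 5.0      # smallest floor used when mapping DNA to ancestors


def filter_segments(segments, min_cm):
    """Keep only the segments at or above the chosen cM threshold."""
    return [segment for segment in segments if segment["cm"] >= min_cm]


# Invented example segments, not real match data.
segments = [
    {"match": "Match A", "chromosome": 3, "cm": 22.4},
    {"match": "Match B", "chromosome": 3, "cm": 11.8},
    {"match": "Match C", "chromosome": 9, "cm": 7.3},
    {"match": "Match D", "chromosome": 14, "cm": 5.1},
    {"match": "Match E", "chromosome": 14, "cm": 3.9},
]

for label, floor in [
    ("genealogy", GENEALOGY_MIN_CM),
    ("strict", STRICT_MIN_CM),
    ("mapping", MAPPING_MIN_CM),
]:
    kept = filter_segments(segments, floor)
    names = ", ".join(segment["match"] for segment in kept)
    print(f"{label} (>= {floor} cM): {len(kept)} segments ({names})")
```

Keeping the thresholds as named values makes it easy to rerun the same list at a different cutoff, depending on whether you are hunting for a recent connection or mapping segments to ancestors.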

The good news about endogamy is that its evidence persists in the DNA of the population, literally almost forever, as long as that “population” exists in descendants – meaning you can find it! In my case, my Acadian brick wall would have fallen much sooner had I known what endogamy looked like and what I was seeing actually meant.

A perfect example of persistent endogamy is that our Sioux male today, along with other nearly fully Native people, including people from South America, matches the ancient DNA of the Anzick child who died and was buried in Montana 12,500 years ago.

These people don’t just match on small segments, but at contemporary matching levels at Family Tree DNA and GedMatch, both.  One individual shows a match of 109 total cM and a single largest segment of DNA at 20.7 cM, a match that would indicate a contemporary relationship of between 3.5 and 4 generations distant – meaning 2nd to 3rd cousins. Clearly, that isn’t possible, but the DNA shared by Anzick Child and that individual today has been intact in the Native population for more than 12,500 years.

The DNA that Anzick Child carried is the same DNA that the Sioux people carry today – because there was no DNA from outside the founder population, no DNA to wash out the DNA carried by Anzick Child’s ancestors – the same exact ancestors of the Sioux and other Native or Native admixed people today.

While endogamy can sometimes be frustrating, the great news is that you will have found an entire population of relatives, a new “clan,” so to speak.  You’ll understand a lot more about your family history and you’ll have lots of new cousins!

Endogamy is both the blessing and the curse of genetic genealogy!

Concepts – Calculating Ethnicity Percentages

There has been a lot of discussion about ethnicity percentages within the genetic genealogy community recently, probably because of the number of people who have recently purchased DNA tests to discover “who they are.”

Testers want to know specifically if ethnicity percentages are right or wrong, and what those percentages should be. The next question, of course, is which vendor is the most accurate.

Up front, let me say that “your mileage may vary.” The vendor that is the most accurate for my German ancestry may not be the same vendor that is the most accurate for the British Isles or Native American. The vendor that is the most accurate overall for me may not be the most accurate for you. And the vendor that is the most accurate for me today, may no longer be the most accurate when another vendor upgrades their software tomorrow. There is no universal “most accurate.”

But then again, how does one judge “most accurate?” Is it just a feeling, or based on your preconceived idea of your ethnicity? Is it based on the results of one particular ethnicity, or something else?

As a genealogist, you have a very powerful tool to use to figure out the percentages that your ethnicity SHOULD BE. You don’t have to rely totally on any vendor. What is that tool? Your genealogy research!

I’d like to walk you through the process of determining what your own ethnicity percentages should be, or at least should be close to, barring any surprises.

By surprises, in this case, we’re assuming that all 64 of your GGGG-grandparents really ARE your GGGG-grandparents, or at least haven’t been proven otherwise. Even if one or two aren’t, that really only affects your results by 1.56% each. In the greater scheme of things, that’s trivial unless it’s that minority ancestor you’re desperately seeking.

A Little Math

First, let’s do a little very basic math. I promise, just a little. And it really is easy. In fact, I’ll just do it for you!

You have 64 great-great-great-great-grandparents.

Generation   # You Have   Who                                    Approximate Percentage of Their DNA That You Have Today
             1            You                                    100%
1            2            Parents                                50%
2            4            Grandparents                           25%
3            8            Great-grandparents                     12.5%
4            16           Great-great-grandparents               6.25%
5            32           Great-great-great-grandparents         3.12%
6            64           Great-great-great-great-grandparents   1.56%

Each of those GGGG-grandparents contributed 1.56% of your DNA, roughly.

Why 1.56%?

Because 100% of your DNA divided by 64 GGGG-grandparents equals 1.56% of each of those GGGG-grandparents. That means you have roughly 1.56% of each of those GGGG-grandparents running in your veins.
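For anyone who would rather see the arithmetic than take it on faith, here is a small Python sketch of the same division. The function names are my own, invented for illustration. Note that 100 divided by 32 is 3.125 and 100 divided by 64 is 1.5625, which the table shortens to 3.12% and 1.56%.

```python
def ancestors_in_generation(generation: int) -> int:
    """Number of ancestors in a generation: 2 parents, 4 grandparents, and so on."""
    return 2 ** generation


def average_percent_per_ancestor(generation: int) -> float:
    """Average share of your DNA from each ancestor in that generation."""
    return 100.0 / ancestors_in_generation(generation)


# Generation 0 is you; generation 6 is your GGGG-grandparents.
for generation in range(7):
    count = ancestors_in_generation(generation)
    percent = average_percent_per_ancestor(generation)
    print(f"generation {generation}: {count} ancestors, {percent}% each on average")
```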

OK, but why “roughly?”

We all know that we inherit 50% of each of our parents’ DNA.

So that means we receive half of the DNA of each ancestor that each parent received, right?

Well, um…no, not exactly.

Ancestral DNA isn’t divided exactly in half, by the “one for you and one for me” methodology. In fact, DNA is inherited in chunks, and often you receive all of a chunk of DNA from that parent, or none of it. Seldom do you receive exactly half of a chunk, or ancestral segment – but half is the AVERAGE.

Because we can’t tell exactly how much of any ancestor’s DNA we actually do receive, we have to use the average number, knowing full well we could have more than our 1.56% allocation of that particular ancestor’s DNA, or none that is discernable at current testing thresholds.

Furthermore, if that 1.56% is our elusive Native ancestor, but current technology can’t identify that ancestor’s DNA as Native, then our Native heritage melds into another category. That ancestor is still there, but we just can’t “see” them today.

So, the best we can do is to use the 1.56% number and know that it’s close. In other words, you’re not going to find that you carry 25% of a particular ancestor’s DNA that you’re supposed to carry 1.56% for. But you might have 3%, half of a percent, or none.
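To get a feel for why the 1.56% is only an average, here is a deliberately crude Python simulation. It is a toy model and not how inheritance really works: it assumes your DNA is made of 200 independent chunks and that each chunk comes from one of your 64 GGGG-grandparents at random. Real DNA is inherited in linked segments from two parents, so the chunk count, the independence and the function name are all assumptions made purely for illustration.

```python
import random
from collections import Counter


def simulate_shares(num_chunks=200, num_ancestors=64, seed=1):
    """Toy model: assign each chunk of DNA to one randomly chosen ancestor.

    Returns the percentage of chunks received from each ancestor.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_ancestors) for _ in range(num_chunks))
    return [100.0 * counts.get(a, 0) / num_chunks for a in range(num_ancestors)]


shares = simulate_shares()

# The average is always 100 / 64, but individual ancestors vary around it.
print(f"average share:  {sum(shares) / len(shares):.2f}%")
print(f"largest share:  {max(shares):.2f}%")
print(f"smallest share: {min(shares):.2f}%")
print(f"ancestors contributing nothing: {sum(1 for s in shares if s == 0)}")
```

Changing the seed gives a different draw each time, but the pattern stays the same: most ancestors land somewhere near the average, a few contribute noticeably more, and some may contribute nothing at all.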

Your Pedigree Chart

To calculate your expected ethnicity percentages, you’ll want to work with a pedigree chart showing your 64 GGGG-grandparents. If you haven’t identified all 64 of your GGGG-grandparents – that’s alright – we can accommodate that. Work with what you do have – but accuracy about the ancestors you have identified is important.

I use RootsMagic, and in the RootsMagic software, I can display all 64 GGGG-grandparents by selecting all 4 of my grandparents one at a time.

In the first screen, below, my paternal grandfather is blue and my 16 GGGG-grandparents that are his ancestors are showing to the far right.

ethnicity-pedigree

Next, my paternal grandmother.

ethnicity-pedigree-1

Next, my maternal grandmother.

ethnicity-pedigree-2

And finally, my maternal grandfather.

ethnicity-pedigre-3

These displays are what you will work from to create your ethnicity table or chart.

Your Ethnicity Table

I simply displayed each of these groups of 16 GGGG-grandparents and completed the following grid. I used a spreadsheet, but you can use a table or simply do this on a tablet of paper. Technology not required.

You’ll want 5 columns, as shown below.

  • Number 1-64, to make sure you don’t omit anyone
  • Name
  • Birth Location
  • 1.56% Source – meaning where in the world did the 1.56% of the DNA you received from them come from? This may not be the same as their birth location. For example, an Irish man born in Virginia counts as an Irish man.
  • Ancestry – meaning if you don’t know positively where that ancestor is from, what do you know about them? For example, you might know that their father was German, but be uncertain about the mother’s nationality.

My ethnicity table is shown below.

ethnicity-table

In some cases, I had to make decisions.

For example, I know that Daniel Miller’s father was a German immigrant, documented and proven. The family did not speak English. They were Brethren, a German religious sect that intermarried with other Brethren. Marriage outside the church meant dismissal – so your children would not have been Brethren. Therefore, it would be extremely unlikely, based on both the language barrier and the Brethren religious customs, for Daniel’s mother, Magdalena, to be anything other than German – plus, their children were Brethren.

We know that most people married people within their own group – partly because that is who they were exposed to, but also based on cultural norms and pressures. When it comes to immigrants and language, you married someone you could communicate with.

Filling in blanks another way, a local German man was likely the father of Eva Barbara Haering’s illegitimate child, born to Eva Barbara in her home village in Germany.

Obviously, there were exceptions, but they were just that, the exception. You’ll have to evaluate each of your 64 GGGG-grandparents individually.

Calculating Percentages

Next, we’re going to group locations together.

For example, I had a total of one plus that was British Isles. Three and a half, plus, that were Scottish. Nine and a half that were Dutch.

ethnicity-summary

You can’t do anything with the “plus” designation, but you can multiply everything else by 1.56%.

So, for Scottish, 3 and a half (3.5) times 1.56% equals 5.46% total Scottish DNA. Follow this same procedure for every category you’re showing.

Do the same for “uncertain.”
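If you keep your table in a spreadsheet, the multiplication is a one-line formula. For those who prefer code, here is a minimal Python sketch of the same step. The dictionary holds only the three counts mentioned above, with the “plus” left out, and the names are my own. It uses the rounded 1.56% figure so the results line up with the hand calculation.

```python
# Rounded average share per GGGG-grandparent, as used in the text (100 / 64).
PERCENT_PER_GGGG_GRANDPARENT = 1.56

# Number of GGGG-grandparents (halves allowed) assigned to each location.
ancestor_counts = {
    "British Isles": 1.0,
    "Scottish": 3.5,
    "Dutch": 9.5,
}


def expected_percentages(counts):
    """Multiply each ancestor count by 1.56% to get the expected percentage."""
    return {
        location: round(count * PERCENT_PER_GGGG_GRANDPARENT, 2)
        for location, count in counts.items()
    }


for location, percent in expected_percentages(ancestor_counts).items():
    print(f"{location}: {percent}%")
```

Add your remaining categories, including “uncertain,” to the dictionary in the same way.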

Incorporating History

In my case, because all of my uncertain lines are on my father’s colonial side, and I do know locations and something about their spouses and/or the population found in the areas where each ancestor is located, I am making an “educated speculation” that these individuals are from the British Isles. These families didn’t speak German, or French, or have French or German, Dutch or Scandinavian surnames. People married others like themselves, in their communities and churches.

I want to be very clear about this. It’s not a SWAG (serious wild-a** guess), it’s educated speculation based on the history I do know.

I would suggest that there is a difference between “uncertain” and “unknown origin.” Unknown origin connotes that there is some evidence that the individual is NOT from the same background as their spouse, or they are from a highly mixed region, but we don’t know.

In my case, this leaves a total of 2 and a half that are of unknown origin, based on the other “half” that isn’t known of some lineages. For example, I know there are other Native lines and at least one African line, but I don’t know what percentage of which ancestor how far back. I can’t pinpoint the exact generation in which that lineage was “full” and not admixed.

I have multiple Native lines on my mother’s side in the Acadian population, but they are further back than 6 generations and the population is endogamous – so those ancestors sometimes appear more than once and in multiple Acadian lines – meaning I probably carry more of their DNA than I otherwise would. These situations are difficult to calculate mathematically, so just keep them in mind.

Given the circumstances based on what I do know, the 3.9% unknown origin is probably about right, and in this case, the unknown origin is likely at least part Native and/or African and probably some of each.

ethnicity-summary-2

The Testing Companies

It’s very difficult to compare apples to apples between testing companies, because they display and calculate ethnicity categories differently.

For example, Family Tree DNA’s regions are fairly succinct, with some overlap between regions, shown below.

ethnicity-ftdna-map

Some of Ancestry’s regions overlap by almost 100%, meaning that any area in a region could actually be a part of another region.

ethnicity-ancestry-map-2

For example, look at the United Kingdom and Ireland. The United Kingdom region overlaps significantly into Europe.

ethnicity-ancestry-map

Here’s the Great Britain region close up, below, which is shown differently from the map above. The Great Britain region actually overlaps almost the entire western half of Europe.

ethnicity-ancestry-great-britain

That’s called hedging your bets, or maybe it’s simply the nature of ethnicity. Granted, the overlaps are a methodology for the vendor not to be “wrong,” but people and populations did and do migrate, and the British Isles was somewhat of a destination location.

This Germanic Tribes map, also from Ancestry’s Great Britain section, illustrates why ethnicity calculations are so difficult, especially in Europe and the British Isles.

ethnicity-invaders

Invaders and migrating groups brought their DNA.  Even if the invaders eventually left, their DNA often became resident in the host population.

The 23andMe map, below, is less detailed in terms of viewing how regions overlap.

ethnicity-23andme-map

The Genographic Project breaks ethnicity down into 9 world regions which they indicate reflect both recent influences and ancient genetics dating from 500 to 10,000 years ago. I fall into 3 regions, shown by the shadowy circles on the map, below.

ethnicity-geno-map-2

The following explanation is provided by the Genographic Project for how they calculate and explain the various regions, based on early European history.

ethnicity-geno-regions

Let’s look at how the vendors divide ethnicity and see what kind of comparisons we can make utilizing the ethnicity table we created that represents our known genealogy.

Family Tree DNA

MyOrigins results at Family Tree DNA show my ethnicity as:

ethnicity-ftdna-percents

I’ve reworked my ethnicity totals format to accommodate the vendor regions, creating the Ethnicity Totals Table, below. The “Genealogy %” column is the expected percentage based on my genealogy calculations. I have kept the “British Isles Inferred” percentage separate since it is the most speculative.

ethnicity-ftdna-table

I grouped the regions so that we can obtain a somewhat apples-to-apples comparison between vendor results, although that is clearly challenging based on the different vendor interpretations of the various regions.

Note the Scandinavian, which could potentially be a Viking remnant, but there would have had to be a whole boatload of Vikings, pardon the pun, or Viking is deeply embedded in several population groups.

Ancestry

Ancestry reports my ethnicity as:

ethnicity-ancestry-amounts

Ancestry introduces Italy and Greece, which is news to me. However, if you remember, Ancestry’s Great Britain ethnicity circle reaches all the way down to include the top of Italy.

ethnicity-ancestry-table

Of all my expected genealogy regions, the most definitive are my Dutch, French and German. Many are recent immigrants from my mother’s side, removing any ambiguity about where they came from. There is very little speculation in this group, with the exception of one illegitimate German birth and two inferred German mothers.

23andMe

23andMe allows customers to change their ethnicity view along a range from speculative to conservative.

ethnicity-23andme-levels

Generally, genealogists utilize the speculative view, which provides the greatest regional variety and breakdown. The conservative view, in general, simply rolls the detail into larger regions and assigns a higher percentage to unknown.

I am showing the speculative view, below.

ethnicity-23andme-amounts

Adding the 23andMe column to my Ethnicity Totals Table, we show the following.

ethnicity-23andme-table-2

Genographic Project 2.0

I also tested through the Genographic project. Their results are much more general in nature.

ethnicity-geno-amounts

The Genographic Project results do not fit well with the others in terms of categorization. In order to include the Genographic ethnicity numbers, I’ve had to add the totals for several of the other groups together, in the gray bands below.

ethnicity-geno-table-2

Genographic Project results are the least like the others, and the most difficult to quantify relative to expected amounts of genealogy. Genealogically, they are certainly the least useful, although genealogy is not and never has been the Genographic focus.

I initially omitted this test from this article, but decided to include it for general interest. These four tests clearly illustrate the wide spectrum of results that a consumer can expect to receive relative to ethnicity.

What’s the Point?

Are you looking at the range of my expected ethnicity versus my ethnicity estimates from these four entities and asking yourself, “what’s the point?”

That IS the point. These are all proprietary estimates for the same person – and look at the differences – especially compared to what we do know about my genealogy.

This exercise demonstrates how widely estimates can vary when compared against a relatively solid genealogy, especially on my mother’s side – and against other vendors. Not everyone has the benefit of having worked on their genealogy as long as I have. And no, in case you’re wondering, the genealogy is not wrong. Where there is doubt, I have reflected that in my expected ethnicity.

Here are the points I’d like to make about ethnicity estimates.

  • Ethnicity estimates are interesting and alluring.
  • Ethnicity estimates are highly entertaining.
  • Don’t marry them. They’re not dependable.
  • Create and utilize your ethnicity chart based on your known, proven genealogy which will provide a compass for unknown genealogy. For example, my German and Dutch lines are proven unquestionably, which means those percentages are firm and should match up relatively well to vendor ethnicity estimates for those regions.
  • Take all ethnicity estimates with a grain of salt.
  • Sometimes the shaker of salt.
  • Sometimes the entire lick of salt.
  • Ethnicity estimates make great cocktail party conversation.
  • If the results don’t make sense based on your known genealogical percentages, especially if your genealogy is well-researched and documented, understand the possibilities of why and when a healthy dose of skepticism is prudent. For example, if your DNA from a particular region exceeds the total of both of your parents for that region, something is amiss someplace – which is NOT to suggest that you are not your parents’ child.  If you’re not the child of one or both parents, assuming they have DNA tested, you won’t need ethnicity results to prove or even suggest that.
  • Ethnicity estimates are not facts beyond very high percentages, 25% and above. At that level, the ethnicity does exist, but the percentage may be in error.
  • Ethnicity estimates are generally accurate to the continent level, although not always at low levels. Note weasel word, “generally.”
  • We should all enjoy the results and utilize these estimates for their hints and clues. For example, if you are an adoptee and you are 25% African, it’s likely that one of your grandparents was African, or two of your grandparents were roughly half African, or all four of your grandparents were one-fourth African. Hints and clues, not gospel and not cast in concrete. Maybe cast in warm Jello.
  • Ethnicity estimates showing larger percentages probably hold a pearl of truth, but how big the pearl and the quality of the pearl is open for debate. The size and value of the pearl is directly related to the size of the percentage and the reference populations.
  • Unexpected results are perplexing. In the case of my unknown 8% to 12% Scandinavian – the Vikings may be to blame, or the reference populations, which are current populations, not historical populations – or some of each. My Scandinavian amounts translate into between 5 and 8 of my GGGG-grandparents being fully Scandinavian – and that’s extremely unlikely in the middle of Virginia in the 1700s.
  • There can be fairly large slices of completely unexplained ethnicity. For example, Scandinavia at 8-12% and even more perplexing, Italy and Greece. All I can say is that there must have been an awful lot of Vikings buried in the DNA of those other populations. But enough to aggregate, cumulatively, to between a great-grandparent at 12.5% and a great-great-grandparent at 6.25%? I’m not convinced. However, all three vendors found some Scandinavian – so something is afoot. Did they all use the same reference population data for Scandinavian? For the time being, the Scandinavian results remain a mystery.
  • There is no way to tell what is real and what is not. Meaning, do I really have some ancient Italian/Greek and more recent Scandinavian, or is this deep ancestry or a reference population issue? And can the lack of my proven Native and African ancestry be attributed to the same?
  • Proven ancestors beyond 6 generations, meaning Native lineages, disappear while undocumentable and tenuous ancestors beyond 6 generations appear – apparently, en masse. In my case, kind of like a naughty Scandinavian ancestral flash mob, taunting and tormenting me. Who are those people??? Are they real?
  • If the known/proven ethnicity percentages from Germany, Netherlands and France can be highly erroneous, what does that imply about the rest of the results? Especially within Europe? The accuracy issue is especially pronounced looking at the wide ranges of British Isles between vendors, versus my expected percentage, which is even higher, although the inferred British Isles could be partly erroneous – but not on this magnitude. Apparently part of my British Isles ancestry is being categorized as either or both Scandinavian or European.
  • Conversely, these estimates can and do miss positively genealogically proven minority ethnicity. By minority, I mean minority to the tester. In my case, African and Native that is proven in multiple lines – and not just by paper genealogy, but by Y and mtDNA haplogroups as well.
  • Vendors’ products and their estimates will change with time as this field matures and reference populations improve.
  • Some results may reflect the ancient history of the entire population, as indicated by the Genographic Project. In other words, if the entire German population is 30% Mediterranean, then your ancestors who descend from that population can be expected to be 30% Mediterranean too. Except I don’t show enough Mediterranean ancestry to be 30% of my German DNA, which would be about 8% – at least not as reported by any vendor other than the Genographic Project.
  • Not all vendors display below 1% where traces of minority admixture are sometimes found. If it’s hard to tell if 8-12% Scandinavian is real, it’s almost impossible to tell whether less than 1% of anything is real.  Having said that, I’d still like to see my trace amounts, especially at a continental level which tends to be more reliable, given that is where both my Native and African are found.
  • If the reason my Native and African ancestors aren’t showing is because their DNA was not passed on in subsequent generations, causing their DNA to effectively “wash out,” why didn’t that happen to Scandinavian?
  • Ethnicity estimates can never disprove that an ancestor a few generations back was or was not any particular ethnicity. (However, Y and mitochondrial DNA testing can.)
  • Absence of evidence is not evidence of absence, except in very recent generations – like 2 (grandparents at 25%), maybe 3 generations (great-grandparents at 12.5%).
  • Continental level estimates above 10-12 percent can probably be relied upon to suggest that the particular continental level ethnicity is present, but the percentage may not be accurate. Note the weasel wording here – “probably” – it’s here on purpose. Refer to Scandinavia, above – although that’s regional, not continental, but it’s a great example. My proven Native/African is nearly elusive and my mystery Scandinavian/Greek/Italian is present in far greater percentages than it should be, based upon proven genealogy.
  • Vendors, all vendors, struggle to separate ethnicity regions within continents, in particular, within Europe.
  • Don’t take your ethnicity results too seriously and don’t be trading in your lederhosen for kilts, or vice versa – especially not based on intra-continental results.
  • Don’t change your perception of who you are based on current ethnicity tests. Otherwise you’re going to feel like a chameleon if you test at multiple vendors.
  • Ethnicity estimates are not a short cut to or a replacement for discovering who you are based on sound genealogical research.
  • No vendor, NOT ANY VENDOR, can identify your Native American tribe. If they say or imply they can, RUN, with your money. Native DNA is more alike than different. Just because a vendor compares you to an individual from a particular tribe, and part of your DNA matches, does NOT mean your ancestors were members of or affiliated with that tribe. These three major vendors plus the Genographic Project don’t try to pull any of those shenanigans, but others do.
  • Genetic genealogy and specifically, ethnicity, is still a new field, a frontier.
  • Ethnicity estimates are not yet a mature technology as is aptly illustrated by the differences between vendors.
  • Ethnicity estimates are that. ESTIMATES.
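Several of the points above turn a reported percentage back into a number of ancestors, such as the 8% to 12% Scandinavian equating to between 5 and 8 GGGG-grandparents. Here is a minimal Python sketch of that conversion, assuming the average share of 100/64 per GGGG-grandparent. The function name is my own, and the result is only a rough equivalent, not a count of real people.

```python
PERCENT_PER_GGGG_GRANDPARENT = 100 / 64  # 1.5625% on average


def gggg_grandparent_equivalents(percent: float) -> float:
    """How many fully-that-ethnicity GGGG-grandparents a percentage would equal."""
    return percent / PERCENT_PER_GGGG_GRANDPARENT


# The Scandinavian range discussed in the list above.
for reported in (8.0, 12.0):
    equivalents = gggg_grandparent_equivalents(reported)
    print(f"{reported}% is about {equivalents:.2f} GGGG-grandparents")
```

Running the same function on 12.5% or 6.25% gives the 8 and 4 GGGG-grandparents that correspond to one great-grandparent or one great-great-grandparent.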

If you’d like to learn more about ethnicity estimates and how they are calculated, you might want to read this article, Ethnicity Testing, A Conundrum.

Summary

This information is NOT a criticism of the vendors. Instead, this is a cautionary tale about correctly setting expectations for consumers who want to understand and interpret their results – and about how to use your own genealogy research to do so.

Not a day passes that I don’t receive very specific questions about the interpretation of ethnicity estimates. People want to know why their results are not what they expected, or why they have more of a particular geographic region listed than their two parents combined. Great questions!

This phenomenon is only going to increase with the popularity of DNA testing and the number of people who test to discover their identity as a result of highly visible ad campaigns.

So let me be very clear. No one can provide a specific interpretation. All we can do is explain how ethnicity estimates work – and that these results are estimates created utilizing different reference populations and proprietary software by each vendor.

Whether the results match each other or customer expectations, or not, these vendors are legitimate, as are the GedMatch ethnicity tools. Other vendors may be less so, and some are outright unethical, looking to exploit the unwary consumer, especially those looking for Native American heritage. If you’re interested in how to tell the difference between legitimate genetic information and a company utilizing pseudo-genetics to part you from your money, click here for a lecture by Dr. Jennifer Raff, especially about minutes 48-50.

Buyer beware, both in terms of purchasing DNA testing for ethnicity purposes to discover “who you are” and when internalizing and interpreting results.

The science just isn’t there yet for answers at the level most people seek.

My advice, in a nutshell: Stay with legitimate vendors. Enjoy your ethnicity results, but don’t take them too seriously without corroborating traditional genealogical evidence!

Concepts – Undocumented Adoptions vs Untested Y Lines

So you took the Y-line test and you don’t match the surnames you expected to match and now you’re worried. Is there maybe an “oops” in your lineage?

One of two things has happened. Either your line has simply not tested or you have an undocumented adoption in your line.

An undocumented adoption is any “adoption” at any time in history that is not documented – so if you didn’t know about it, it’s an undocumented adoption. Often, these events in genetic genealogy are referred to as NPEs, Non-Paternal Events, but I prefer undocumented adoptions.

Yes, there are myriad ways for this to happen, and I mean besides the obvious infidelity situation, but right now, you only care about figuring out IF you have an undocumented adoption, not how it happened.

How can you tell if your line is one that simply hasn’t been tested or if there is an undocumented adoption in your line? Sometimes you can’t, and you’ll simply have to wait until more people of your surname test. Of course, you can always recruit people through the Rootsweb and Genforum lists and boards and social media.

Most of the time this is a process of elimination. If you can’t find anything to suggest that you have an undocumented adoption, then your line is probably simply untested, especially if it’s not a common surname or your ancestors had few male children.

However, there are often clues lurking relative to undocumented adoptions.

Scenario 1 – Right Family, Non-Matching DNA

If you are part of a DNA surname project and there are other people who have tested, that you don’t match, that claim the same ancestor as you do – you might have an undocumented adoption on your hands.

In this case, someone’s genealogy is wrong, yours or theirs. By wrong, that doesn’t mean you made a mistake. You (or they) may have tracked the line back to the right ancestor, but instead of being the child of a son of John Doe, for example, your ancestor was the child of the daughter of John Doe, who wasn’t married at the time and had a child by a Smith, but gave the child her surname, Doe.

undoc-1

So right Doe family, wrong child giving birth. There are also other family situations that are discovered utilizing Y DNA testing, like a child simply using the step-father’s name. In this case, finding more descendants to test, especially through other sons will help resolve the paternity question. Given the scenario above, we really don’t know whether the green or red DNA is the Y DNA of John Doe. We need the DNA of another son to resolve the question.

Scenario 2 – Accurate Genealogy, Undocumented Adoption

If you are part of a DNA surname project and two other people who descend from two separate sons of the same ancestor you claim, both having good solid genealogy back to that ancestor, match each other but not you – you do have an undocumented adoption on your hands. This situation pretty much removes any doubt about your ancestral line if you are Steve, below.

undoc-2

Assuming their genealogy is correct (and yes, the genealogy could be wrong), theirs (the green) is the paternal line from that ancestor, so you need to start looking at situations that might lend themselves to your ancestor having that name but not sharing that paternal genetic line.

The break in the ancestral line can have occurred anyplace between John Doe and son Steve and the tester, Steve V.  You might want to test males descended from men between Steve Doe and Steve Doe V.  Word of warning here – if you don’t want to know the answer, don’t test.  The break could be between you and your father or your father and grandfather.  Sometimes, these possibilities are just too close for comfort.

At this point, I would turn to autosomal testing to see if any of the people in the surname project match you autosomally. That may tell you if you are actually descended from this line at all – perhaps through a female child as described above. With autosomal testing, especially of distant relatives, you can prove a positive, that you are related, but you can’t really prove a negative, that you aren’t related.

If you’re testing second cousins or closer, you can prove a negative.  If you don’t match your full second cousins, there is a problem – and it’s not the genealogy.

Scenario 3 – Matching a Group of Men with a Particular Surname

If you match a significant number of men with other surnames, with one surname in particular being closely matched and quite prevalent, it’s a large hint. For example, let’s say you have 6 matches at your highest marker level, and 5 of them are Miller men descended from the same ancestor. Chances are very good that you are of Miller descent too.

Again, I’d turn to autosomal testing at this point to see how closely you are related to your closest matching Y DNA Millers or others descended from this same ancestral line.
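The Miller example can be tallied by hand, but with a longer match list a quick count helps. Here is a minimal Python sketch that counts surnames among your matches at your highest marker level. The list is made-up sample data mirroring the example above (5 Millers out of 6 matches), and the function name is hypothetical.

```python
from collections import Counter

# Made-up surnames of six Y DNA matches at the highest marker level tested.
y_match_surnames = ["Miller", "Miller", "Miller", "Miller", "Miller", "Smith"]


def most_common_surname(surnames):
    """Return the most frequent surname, its count, and its share of all matches."""
    surname, count = Counter(surnames).most_common(1)[0]
    return surname, count, count / len(surnames)


surname, count, share = most_common_surname(y_match_surnames)
print(f"{surname}: {count} of {len(y_match_surnames)} matches ({share:.0%})")
```

A dominant surname is only a hint. The matches also need to descend from the same ancestor, as described above, before the hint carries much weight.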

undoc-3

Scenario 4 – Your Line is Untested

If your surname is something quite unusual, like Ferverda for example, and you don’t fit the situations described above, then it’s likely that your line simply hasn’t tested yet. In this case, the grandfather of our tester was the immigrant from the Netherlands, and Ferverda, both there and in the US, is a very unusual name.

undoc-4

Of course, your line having not tested can happen with common surnames too.

Utilizing Y Search

Check www.ysearch.org periodically to see if others of your surname took the Y chromosome test elsewhere and just got around to entering the results into YSearch, even though the other testing companies (Ancestry, Sorenson) have been defunct for some time now relative to Y DNA.

undoc-5

You can also search at YSearch by surname. You don’t have any way to view results by surname, outside of projects, at Family Tree DNA, so the only way to discover someone who claims your paternal line and doesn’t match you is to search by surname at YSearch and hope they have included a tree.

undoc-6

In this example, one person with the Estes surname has results at YSearch, but 40 have Estes in their tree, just not as their patrilineal surname.

undoc-7

Keep in mind that depending on how far back in time an undocumented adoption occurred, you may find matches to people with that same surname who descend from your common biological ancestor, but you may still not share the original ancestor. In the example above, the Doe men in red all match each other, because their unknown Smith ancestor is the same, but they don’t match the descendant of John Doe through son James.

A non-match to men of your same surname isn’t a cause for panic, but it is time to do some additional digging to see if you can discover why.

Happy ancestor hunting!

Concepts – Why DNA Testing the Oldest Family Members is Critically Important

Recently, someone asked me to explain why testing the older, in fact, the oldest family members is so important. What they really wanted were talking points in order to explain to others, in just a few words, so that they could understand the reasoning without having to understand the details or the science.

Before I address that question, I want to talk briefly about how Y and mitochondrial DNA are different from autosomal DNA, because the answer to the “oldest ancestor” question is a bit different for those two types of tests versus autosomal DNA.

In the article, 4 Kinds of DNA for Genetic Genealogy, I explain the differences between Y and mitochondrial DNA testing, who can take each, and how they differ from autosomal DNA testing.

Y and Mitochondrial DNA

In the graphic below, you can see that the Y chromosome, represented by blue squares, is inherited only by males from direct patrilineal males in the male’s tree – meaning inherited from his father who inherited the Y chromosome from his father who inherited it from his father, on up the tree. Of course, along with the Y chromosome, generally, the males also inherited their surname.

Y and mito

Mitochondrial DNA, depicted as red circles, is inherited by both genders of children, but ONLY the females pass it on. Mitochondrial DNA is inherited from your mother, who inherited it from her mother, who inherited it from her mother, on up the tree in the direct matrilineal path.

  • Neither Y nor mitochondrial DNA is ever mixed with the DNA of the other parent, so it is never “lost” during inheritance. It is inherited completely and intact. This allows us to look back more reliably much further in time and obtain a direct, unobstructed, view of the history of the direct patrilineal or matrilineal line.
  • Changes between generations are caused by mutations, not by the DNA of the two parents being mixed together and by half being lost during inheritance.
  • This means that we test the oldest relevant ancestor in that line to be sure we have the “original” DNA and not results that have incurred a mutation, although generally, mutations are relatively easy to deal with for both Y and mitochondrial DNA since the balance of this type of DNA is still ancestral.

Testing the oldest generation is not quite as important in Y and mitochondrial DNA as it is for autosomal DNA, because most, if not all, of the Y and mitochondrial DNA will remain exactly the same between generations.  That is assuming, of course, that no unknown adoptions, known as Nonparental Events (NPEs) occurred between generations.

However, autosomal DNA is quite different. When utilizing autosomal DNA, every person inherits only half of their parents’ DNA, so half of their autosomal ancestral history is lost with the half of their parents’ DNA that they don’t inherit. For autosomal DNA, testing the oldest people in the family, and their siblings, is critically important.

Autosomal DNA

In the graphic below, you can see that the Y and mitochondrial DNA, still represented by a small blue chromosome and a red circle, respectively, are each inherited from only one line.  The son receives an entirely intact blue Y chromosome and both the son and daughter receive an entirely intact mitochondrial DNA circle.

Autosomal DNA, on the other hand, represented by the variously colored chromosomes assigned to the 8 great-grandparents on the top row, is inherited by the son and daughter, at the bottom, in an entirely different way.  The autosomal chromosomes inherited by the son and daughter have pieces of blue, yellow, green, pink, grey, tan, teal and red mixed in various proportions.

Autosomal path

In fact, you can see that in the first generation, the grandfather, for example, inherited both a pink and green chromosome from his mother, and a blue and yellow chromosome from his father, not to be confused with the smaller blue Y chromosome which is shown separately. The grandmother inherited a grey and tan chromosome from her father and a teal and red chromosome from her mother, again not to be confused with the red mitochondrial circle.

In the next generation, the father inherited parts of the pink, green, blue and yellow DNA. The mother inherited parts of the grey, tan, teal and red DNA.

Part of the answer to the question of why it’s so important to test older generations is found in this graphic.

  • The children inherit even smaller portions of their ancestor’s autosomal DNA than their parents inherited. In fact, in every generation, the child inherits half of the DNA of each parent. That means that the other half of the parents’ autosomal DNA is not inherited by the child, so in each generation, you lose half of the autosomal DNA from the previous generation, meaning half of your ancestors’ DNA.
  • Each child inherits half of their parents’ DNA, but not the same half. So different children from the same parents will carry a different part of their parents’ autosomal DNA, meaning a different part of their ancestors’ DNA.
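The halving described in the bullets above can be put into numbers. Here's a minimal sketch, assuming the simple average case: you receive exactly half of each parent's autosomal DNA, but the share from any one grandparent or great-grandparent is only an average, so real results will vary around these figures. The function name is my own, for illustration only.

```python
from fractions import Fraction


def expected_share(generations_back):
    """Average fraction of your autosomal DNA that came from one ancestor
    who is `generations_back` generations above you (1 = parent)."""
    return Fraction(1, 2) ** generations_back


# Print the average share for the three generations shown in the graphic.
for label, n in [("parent", 1), ("grandparent", 2), ("great-grandparent", 3)]:
    share = expected_share(n)
    print(f"{label}: {float(share):.1%} on average")
```

Every step up the tree cuts the average share in half, which is why each older tester carries so much ancestral DNA that never reached you.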

The best way to understand the actual real-life ramifications of inheriting only half of your parent’s DNA is by way of example.

I have tested at Family Tree DNA and so has my mother. All of my mother’s DNA and matches are directly relevant to my genealogy and ancestry, because I share all of my mother’s ancestors. However, since I only inherited half of her DNA, she will have many matches to cousins that I don’t have, because she carries twice as much of our ancestors’ DNA as I do.

Mother’s Matches | My Matches in Common With Mother | Matches Lost Due to Inheritance
920 | 371 | 549

As you can see, I only share 371 of the matches that mother has, which means that I lost 549 matches because I didn’t inherit those segments of ancestral DNA from mother. Therefore, mother matches many people that I don’t.
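If you keep your match lists as simple lists of names, the "lost" matches are just the people on the parent's list who aren't on yours. Here's a small sketch of that idea. The match names are made up for illustration, and the last lines repeat the arithmetic from my own counts above.

```python
def lost_matches(parent_matches, child_matches):
    """Matches the parent has that the child did not inherit."""
    return set(parent_matches) - set(child_matches)


# Hypothetical match lists, for illustration only.
mother = {"Match A", "Match B", "Match C", "Match D", "Match E"}
me = {"Match A", "Match C", "Match Z"}  # Match Z is on my father's side

in_common = mother & me
lost = lost_matches(mother, me)
print(sorted(in_common))  # people we both match
print(sorted(lost))       # people only mother matches

# The counts from the table above: 920 total, 371 in common.
print(920 - 371)  # 549 matches lost
```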

That’s exactly why it’s so critically important to test the oldest generation.

It’s also important to test siblings: for example, your grandparents’ siblings, your parents’ siblings and your own siblings if your parents aren’t living. These people all share all of your ancestors.

I test my cousin’s siblings as well, if they are willing, because each child inherits a different half of their parent’s DNA, which is your ancestor’s DNA, so they will have matches to different people.

How important is it to test siblings, really?

Let’s take a look at this 4 generation example of matching and see just how many matches we lose in four generations. We begin with my mother’s 920 matches, as shown above, but let’s add two more generations beyond me.

4-gen-match-totals

As you can see in the above example, the two grandchildren inherited a different combination of their parent’s DNA, given that Grandchild 1 has 895 matches in common with one of their parents and Grandchild 2 has 1046 matches in common with the same parent. Those matches aren’t to entirely the same set of people either – because the two siblings inherited different DNA segments from their parent. The difference in the number of matches and the difference in the people that the siblings match in common with their parent illustrates the difference that inheriting different parental DNA segments makes relative to genealogy and DNA matching.

However, if you look at the matching number in common with their grandparent and great-grandparent, the differences become even greater and the losses between generations become cumulative. Just think how many matches are really lost, given that in our illustration we are only comparing to one of two parents, one of four grandparents and one of 8 great-grandparents.

The really important numbers are the Lost Matches, shown in red. These are the matches that WOULD BE LOST FOREVER IF THE OLDER GENERATION(S) HAD NOT TESTED.

Note that the lost matches are much higher numbers than the matches.

Summary

In summary, here are the talking points about why it’s critically important to test the oldest members of each generation, and every generation between you and them.

Autosomal DNA:

  1. Every person inherits only half of their parents’ DNA, meaning that half of your ancestors’ DNA is lost in each generation – the half you don’t receive.
  2. Siblings each inherit half of their parents’ DNA, but not the same half, so each child has some of their ancestor’s DNA that another child won’t have.
  3. The older generations of direct line relatives and their siblings will match people that you don’t, and their matches are as relevant to your genealogy as your own matches, because you share all of the same ancestors.
  4. Being able to see that you match someone who also matches a known ancestor or cousin shows you immediately which ancestral line the match shares with you.
  5. Your cousins, even though they will have ancestral lines that aren’t yours, still carry parts of your ancestors’ DNA that you don’t, so it’s important to test cousins and their siblings too.

Y and mitochondrial DNA:

  1. Testing older generations allows you to be sure that you’re dealing with DNA results that are closer to, or the same as, your ancestor, without the possibility of mutations introduced in subsequent generations.
  2. In many cases, your cousins, father, grandfather, etc. will carry Y or mitochondrial DNA that you don’t, but that descends directly from one of your ancestors. Your only opportunity to obtain that information is to test lineally appropriate cousins or family members. This is particularly relevant for relatives such as fathers, grandfathers, and paternal aunts and uncles, whose mitochondrial DNA is not passed down to you.

I wrote about creating your DNA pedigree chart for Y and mitochondrial DNA here.

Be sure to test the oldest generations autosomally, but also remember to review your cousins’ paths of descent from your common ancestors closely to determine if their Y or mitochondrial DNA is relevant to your genealogy! Y, mitochondrial and autosomal DNA are all different parts of unraveling the ancestor puzzle for each of your family lines.

You can order the Y, mitochondrial DNA and Family Finder tests from Family Tree DNA.

Happy ancestor hunting!

Concepts – Managing Autosomal DNA Matches – Step 2 – Updating Match Spreadsheets, Bucketed Family Finder Matches and Pileups

We’re going to do three things in this article.

  1. Updating Your DNA Master Spreadsheet With New Matches
  2. Labeling Known Pileup Areas
  3. Utilizing Phased Family Finder Matches

You must do item one above, before you can do item three…just in case you are thinking about taking a “shortcut” and jumping to three. Word to the wise. Don’t.

OK, let’s get started! I promise, after we get the housework done, you’ll have a LOT of fun! Well, fun for a genetic genealogist anyway!

Updating Your Chromosome Browser Spreadsheet

If you haven’t updated your chromosome browser spreadsheet at Family Tree DNA since you originally downloaded your matches, it’s time to do that. You need to do this update so that your DNA Master Spreadsheet is in sync with your current matches before you can add the Family Finder bucketed matches to your master spreadsheet. Just trust me on this and understand that I found out the hard way. You don’t have to traipse through that same mud puddle because I already did and I’m warning you not to.

Let’s get started updating our DNA Master spreadsheet with our latest matches.  It’s a multi-step process and you’re going to be working with three different files:

  • File 1 – Your DNA Master Matches spreadsheet that you have created. This is the file you will be updating with information from the other two files, below.
  • File 2 – A current download of all of your chromosome browser file matches.
  • File 3 – A current download of a list of your matches.

The steps you will take, are as follows:

  1. Download a new Chromosome Browser Spreadsheet, but DO NOT overwrite your existing DNA Master spreadsheet, or you’ll be swearing, guaranteed. This chromosome browser spreadsheet is downloaded from the Family Finder chromosome browser page. Label it with a date and save it as an Excel file.
  2. Download a new Matches spreadsheet. This spreadsheet is downloaded from the bottom of your matches page. Label it with the same date and save it as an Excel file too.
  3. Update your Master DNA Matches spreadsheet utilizing the instructions provided below.

If you need a refresher about downloading spreadsheet information from Family Tree DNA, click here.

Your Matches spreadsheet will include a column labeled “Match Date.”

concepts2-match-date

On your Matches spreadsheet, sort the Match Date column in reverse order (sort Z to A) and print the list of matches that occurred since your last update date – meaning the date you last updated your DNA Master Spreadsheet.

If you need a refresher about how to sort spreadsheets, click here.

concepts2-match-list

This list will be your “picklist” from the new chromosome browser match spreadsheet you downloaded. I removed the middle and last names of the matches, above, to protect their privacy, but you’ll have their full name to work with.

After your spreadsheet is sorted by match date, with the most current date at the top, you’ll have a list of the most recent matches, meaning those that happened since your last file download/update. Remember, I told you to record on a secondary page in your DNA Master Spreadsheet the history of the file, including the date you do things? This is why.  You need to know when you last downloaded your matches so that you don’t duplicate existing matches in your spreadsheet.

Why don’t you just want to download a new spreadsheet and start over?

concepts2-headers

Remember the color coding and those pink columns we’ve been adding, at right, above, so you can indicate which side that match is from, if the segment is triangulated, how you are related, the most recent common ancestor, the ancestral line, and other notes? If you overwrite your current DNA Master spreadsheet, all of that research information will be gone and you’ll have to start over. So as inconvenient as it is, you’ll need to go to the trouble of adding only your new matches to your DNA Master spreadsheet.

Utilizing the new Chromosome Browser match spreadsheet, you are going to scroll down (or Ctrl+F) and find the names of the people you want to add to your master spreadsheet. Those are the people on your Matches spreadsheet whose match date is later than the date you last downloaded the chromosome browser information.

When you find the person’s name (Amy in this example) on the Chromosome Browser Match spreadsheet, highlight the cells and right click to copy the contents of those cells so that you can paste them at the bottom of your DNA Master Spreadsheet.

concepts2-pick-list

Next, open your Master DNA Spreadsheet, and right click to paste the cells at the very bottom of the spreadsheet, positioning the cursor in the first cell of the first row where you want to paste, shown below.

concepts2-paste

Then click on Paste to paste the cells.

concepts2-paste-position

Repeat this process for every new match, copy/pasting all of their information into your DNA Master Spreadsheet.  I try to remember to do this about once a month.
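For those comfortable with a little scripting, the picklist and copy steps above can be sketched in Python. This is only a rough illustration, not a tool I use or one Family Tree DNA provides. The “Match Date” and “Full Name” columns come from the Matches spreadsheet described above, but the date format, the MATCHNAME column name in the chromosome browser file, and the sample rows are all assumptions you would need to check against your own downloads.

```python
from datetime import date, datetime


def new_match_names(matches, last_update, date_format="%m/%d/%Y"):
    """Names of matches whose Match Date falls after the last update.

    `date_format` is an assumption; adjust it to match your file.
    """
    names = set()
    for row in matches:
        matched_on = datetime.strptime(row["Match Date"], date_format).date()
        if matched_on > last_update:
            names.add(row["Full Name"])
    return names


def rows_to_append(browser_rows, names, name_column="MATCHNAME"):
    """Chromosome browser rows belonging to the people on the picklist.

    `name_column` is an assumed header name for the match's name.
    """
    return [row for row in browser_rows if row[name_column] in names]


# Hypothetical rows standing in for the two downloaded files.
matches = [
    {"Full Name": "Example Match One", "Match Date": "03/15/2016"},
    {"Full Name": "Example Match Two", "Match Date": "07/02/2016"},
]
browser_rows = [
    {"MATCHNAME": "Example Match One", "CHROMOSOME": "1"},
    {"MATCHNAME": "Example Match Two", "CHROMOSOME": "4"},
    {"MATCHNAME": "Example Match Two", "CHROMOSOME": "9"},
]

picklist = new_match_names(matches, last_update=date(2016, 6, 1))
to_paste = rows_to_append(browser_rows, picklist)
print(sorted(picklist), len(to_paste))
```

This only works if the names are spelled identically in both files, and the rows it returns still need to be pasted at the bottom of your DNA Master spreadsheet, never over the top of it.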

Housekeeping note – If you’re wondering why some graphics in this article are the spreadsheet itself, and some are pictures of my screen (taken with my handy iPhone,) like the example above, it’s because when you do a screen capture, the screen capture action removes the drop down box that I want you to see in the pictures above. Yes, I know these pictures aren’t wonderful – but they are sufficient for you to see what I’m doing and that’s the goal.

Combined Spreadsheets

In my case, if you recall, I have a combined master spreadsheet with my matches and my mother’s matches in one spreadsheet. You may have this same situation with parents and grandparents or your full siblings if your parents are missing.

You will need to repeat this process for each family member whose entire match list resides in your DNA Master spreadsheet.

I know, I groaned too. And just in case you’re wondering, I’ve commenced begging at Family Tree DNA for a download by date function – but apparently I did not commence begging soon enough, because as of the date of this article, it hasn’t happened yet – although I’m hopeful, very hopeful.

After your spreadsheet is updated, we have a short one-time housekeeping assignment, then we’ll move on to something much more fun.

Known Pileup Regions

I want you to add the following segments into your DNA Master spreadsheet. These are known pileup regions in the human genome, also known as excess IBD (identical by descent) regions. This means that you may well phase against your parents, but the match is not necessarily genealogical in nature, because many individuals match in these areas, by virtue of being human. Having said that, close relationships may match you in these regions. Hopefully they will also match you in other regions as well, because it’s very difficult to tell if matches in these regions are by virtue of descent genealogically or because so many people match in these regions by virtue of being human.

concepts2-known-pileup-regions

You can color code these rows in your spreadsheet so you will notice them.  If you do, be sure to use a color that you’re not using for something else.

I have used several sources for this information, including the ISOGG wiki phasing page and Sue Griffith’s great Genealogy Junkie blog article titled Chromosome Maps Showing Centromeres, Excess IBD Regions and HLA Region. The HLA region on chromosome 6 is the most pronounced. Tim Janzen states that he has seen as many as 2000 SNP segments in this region that are identical by population, or at least they do not appear to be identical by descent, meaning he cannot find the common ancestor. His personal HLA region boundaries are a bit larger too, from 25,000,000 to 35,000,000. Regardless of the exact boundaries that you use, be aware of this very “matchy” region when you are evaluating your matches.  This is exactly why you’re entering these into your DNA Master spreadsheet – so you don’t have to “remember.”
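If you ever want to flag these regions automatically instead of by eye, the check is a simple overlap test between a match segment and a pileup region on the same chromosome. Here's a minimal sketch, assuming the only region listed is the HLA range on chromosome 6 that Tim Janzen uses, as quoted above. You would add the other regions from your own table, and the function name is my own invention.

```python
# (chromosome, start, end, label). Only the HLA boundaries quoted in the
# text are listed here; other pileup regions are left for you to add.
PILEUP_REGIONS = [
    (6, 25_000_000, 35_000_000, "HLA"),
]


def pileup_labels(chromosome, start, end, regions=PILEUP_REGIONS):
    """Labels of known pileup regions that a match segment overlaps."""
    return [
        label
        for region_chromosome, region_start, region_end, label in regions
        if region_chromosome == chromosome
        and start <= region_end
        and end >= region_start
    ]


print(pileup_labels(6, 30_000_000, 40_000_000))  # overlaps the HLA region
print(pileup_labels(6, 40_000_000, 50_000_000))  # same chromosome, no overlap
```

A flagged segment isn't automatically worthless. It just means you want to see matching elsewhere too before you call the match genealogical.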

By the way, Family Tree DNA and GedMatch use Build 36, but eventually they will move to Build 37 of the human genome, so you might as well enter this information now so it will be there when you need it. If your next question is about how that transition will be handled, the answer is that I don’t know, and we will deal with it at that time.

I do not enter the SNP poor regions, because Family Tree DNA does not utilize those regions at all, and they are the greyed out regions of your chromosome map, shown below.

concepts2-snp-poor-regions

On my own spreadsheet, I have a few other things too.

I have indicated chromosomal regions where I carry minority ancestry.  For both my mother and me, chromosome 2 has significant Native admixture.  This Native heritage is also confirmed by mitochondrial and Y DNA tests on relevant family members.

concepts2-native-segments

If you carry any Native American or other minority admixture, where minority is defined as not your majority ethnicity, as determined by any of the testing companies, you can utilize GedMatch ethnicity tools to isolate the segments where your specific admixture occurs. I described how to do this here as part of The Autosomal Me series. I would suggest that you use multiple tools and look for areas that consistently show with that same minority admixture in all or at least most of the tools. Note that some tools are focused towards a specific ethnicity and omit others, so avoid those tools if the ethnicity you seek is not in line with the goals of that specific tool.

Ok, now that our housekeeping is done, we can have fun.

Adding Phased Family Finder Matches to your Spreadsheet

I love the new Family Tree DNA phased Family Finder matches that assign maternal or paternal “sides” to matches based on your matches to either a parent or close relative. If you would like a refresher on parental phasing, click here.

We’re going to utilize that Match spreadsheet you just downloaded once again.

In this case, we’re going to do something a bit different.

This time, we’re going to sort by the last column, “Matching Bucket.” (Please note you can enlarge any image by clicking or double clicking on it.)

concepts2-match-bucket

When you’re done sorting the “Matching Bucket” column, you will have four groups of matches, as follows:

  • Both
  • Maternal
  • N/A
  • Paternal

I delete the N/A rows, which means “not applicable” – in other words, the match did not meet the criteria to be assigned to a “side.” You can read about the criteria for phased Family Finder matches here and here. If you don’t want to delete these rows, you can just ignore them.
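The sort-and-delete step above amounts to grouping your matches by bucket and dropping the N/A group. Here's a short sketch of that, using the “Matching Bucket” and “Full Name” column names from the Matches spreadsheet. The sample rows are hypothetical, and the bucket values are assumed to be spelled exactly as in the list above.

```python
from collections import defaultdict


def group_by_bucket(matches, bucket_column="Matching Bucket"):
    """Group match names by bucket, skipping matches marked N/A."""
    groups = defaultdict(list)
    for row in matches:
        bucket = row[bucket_column]
        if bucket == "N/A":
            continue  # not assigned to a side, so leave it out
        groups[bucket].append(row["Full Name"])
    return dict(groups)


# Hypothetical rows, for illustration only.
matches = [
    {"Full Name": "Match A", "Matching Bucket": "Maternal"},
    {"Full Name": "Match B", "Matching Bucket": "N/A"},
    {"Full Name": "Match C", "Matching Bucket": "Paternal"},
    {"Full Name": "Match D", "Matching Bucket": "Both"},
    {"Full Name": "Match E", "Matching Bucket": "Maternal"},
]

buckets = group_by_bucket(matches)
print(buckets)
```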

The next thing I do is to add a column before the first column on the spreadsheet, so before “Full Name.”

In this case, you can highlight either the entire column or just the column heading, and right click to insert an entire column to the left.

concepts2-insert-column

If these are your matches, add your name in the “Who” column. If these are your parents’ or full siblings’ matches, add their names in this column. When you have a combined spreadsheet, it’s critical to know whose matches are whose.

Then select colors for the maternal, paternal and both buckets, and color the rows on your spreadsheet accordingly.

I use pink and blue, appropriately, but not exactly the same pink and blue I use for the mother and father spreadsheet rows in my DNA Master spreadsheet. I used a slightly darker pink and slightly darker blue so I can see the difference at a glance. The yellow, or gold in this case, indicates a match to both sides.

concepts2-bucket-colors

You’re only going to actually utilize the first two columns of information.

Highlight and copy the first two columns, without the header, as shown below.

concepts2-bucket-columns

Then open your master spreadsheet and paste this information at the very bottom of your spreadsheet in the first two columns.

concepts2-bucket-paste

After the paste, your spreadsheet will look like this.

concepts2-bucket-rows

Next, sort your spreadsheet by match name. In this case, RVH is the match (white row).

concepts2-bucket-match-sort

Be still my heart. Look what happens. By color, you can see who matches you on which sides, for those who are assigned to parental buckets.  Now my white RVH match row is accompanied by a gold row as well telling me that RVH matches me on both my maternal and paternal sides.

Let’s look at another example. In the case of Cheryl, she is my mother’s first cousin. Since I have combined both my mother’s and my spreadsheets, you can see that Cheryl matches both me and my mother on chromosomes 19 and 20 below. Mother’s match rows are pink and my rows are white.

concepts2-bucket-match-maternal

In this example, you can see that Cheryl is assigned on my maternal side by Family Tree DNA, based on the dark pink match row that we just added. By looking at the spreadsheet itself, you can confirm that Cheryl is a match on my mother’s side. I am only showing chromosomes 19 and 20 as examples, but we match on several different locations.

I don’t have as many paternal side matches, because my father is not in the system, but I do have several cousins to phase against.

concepts2-bucket-match-paternal

Here’s my cousin, Buster, assigned paternally, which is accurate. In Buster’s case, I already have him assigned on my Dad’s side, but if I hadn’t already made this assignment, I could make that with confidence now, based on Family Tree DNA’s assignment.  The blessing here is that the usefulness of Buster’s assignment paternally doesn’t end there, but his results, and mine, together will be used to assign other matches to buckets as well.  Cousin matching is the gift that keeps on giving.

Because my DNA Master spreadsheet includes my mother’s information as well, we need to add her phased Family Finder matches too.

Mother’s Family Finder Matches

Because I have my mother’s and my results combined into one DNA Master spreadsheet, I repeat the same process for my mother, except I type her name in the first column I added with the title of “Who.”

concepts2-mother

Continue with the same “Adding Phased Family Finder Matches” instructions above, and when you are finished, you will have a Master DNA Spreadsheet that includes your information, your parent’s information, and anyone who is phased for either of you maternally, paternally or to both sides will be noted in your spreadsheet by match and color coded as well.

Let’s take a look at cousin Cheryl’s matches to both mother and me on our spreadsheet now with our maternal and paternal buckets assigned.

concepts2-cheryl-to-mother

As you can see, my results are the white row, and my Family Finder phased matches indicate that Cheryl is a match on my mother’s side, which is accurate.

Looking at my mother Barbara’s matches, the pink rows, and then at Barbara’s Family Finder phased match information, we can see that Cheryl matches mother on the blue, or paternal side, which is also accurate, per the pedigree chart below.

Margaret Lentz chart

You can see that Barbara and Cheryl are in the same generation, first cousins, and Barbara matches Cheryl on her paternal line which is reflected in the Family Finder bucketing.

I have updated the “Side” column to reflect the Family Finder bucketing information, although in this case, I already had the sides assigned based on previous family knowledge.

concepts2-bucket-matching-blended

In this example of viewing my mother and my combined spreadsheet matches, you are seeing the following information:

  • Cheryl matches me – white rows
  • Cheryl matches me on my maternal side – dark pink row imported from Match spreadsheet
  • Cheryl matches mother (Barbara) – light pink rows
  • Cheryl matches mother on her father’s side – blue row imported from Match spreadsheet

I find this combined spreadsheet with the color coding very visual and easy to follow.  Better yet, when other people match mother, Cheryl and me on this same segment, they fall right into this grouping on my DNA Master spreadsheet, so the relationship is impossible to miss.  That’s the beauty of a combined spreadsheet.

You can do a combined spreadsheet with individuals whose DNA is “yours,” meaning that all of their ancestors are also your ancestors. Those individuals would be:

  • Either or both parents
  • Grandparents
  • Aunts and Uncles
  • Full siblings
  • Great-aunts and great-uncles

Why not half siblings or half aunts-uncles? Those people have DNA from someone who is not your ancestor. In other words, your half siblings have the DNA from only one of your parents, and you don’t want their matches from their other parent in your spreadsheet. You only want matches that positively descend from your ancestors.

While your grandparents, great-aunts, great-uncles, parents, aunts and uncles will have matches that you don’t, those matches may be critically important to you, because they have DNA from your ancestors that you didn’t inherit. So your combined DNA Master spreadsheet represents your DNA and the DNA of your ancestors found in your relatives who descend directly ONLY from your ancestors. Those relatives have DNA from your ancestors that has washed out by the time it gets to you.

Why can’t your cousins be included in your DNA Master spreadsheet?

I want you to take a minute and think about the answer to this question.

Thinking…..thinking….thinking…. (can you hear the Jeopardy music?)

And the answer is….

If you answered, “Because my aunt or uncle married someone with whom they had children, so my cousins have DNA that is not from an ancestor of mine,” you would be exactly right!!!
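The rule behind that answer can be written down as a simple test: a relative belongs in your combined spreadsheet only if every earliest ancestor in their tree is also in yours. Here's one way to sketch it, assuming a small pedigree kept as a dictionary of each person's recorded parents. The names and the helper functions are hypothetical, for illustration only.

```python
def founders(person, parents):
    """Earliest recorded ancestors of a person (people with no recorded parents)."""
    recorded = parents.get(person)
    if not recorded:
        return {person}
    result = set()
    for parent in recorded:
        result |= founders(parent, parents)
    return result


def can_combine(relative, me, parents):
    """True if all of the relative's DNA traces back to my own ancestors."""
    return founders(relative, parents) <= founders(me, parents)


# Hypothetical pedigree: each person maps to their recorded parents.
parents = {
    "Me": ["Father", "Mother"],
    "Father": ["Paternal grandfather", "Paternal grandmother"],
    "Mother": ["Maternal grandfather", "Maternal grandmother"],
    "Aunt": ["Maternal grandfather", "Maternal grandmother"],
    "Cousin": ["Aunt", "Uncle by marriage"],
    "Half sibling": ["Mother", "Other father"],
}

for relative in ["Mother", "Aunt", "Cousin", "Half sibling"]:
    print(relative, can_combine(relative, "Me", parents))
```

The cousin and the half sibling fail the test for the same reason: each has one parent who is not in my tree, so some of their matches come from a family that isn't mine.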

The great news is that between a combined spreadsheet and the new Family Finder bucketed matches, you can determine a huge amount about your matches.

After discovering which matches are bucketed, you can then use the other tools at Family Tree DNA, like “in common with” to see who else matches you and your match. The difference between bucketing and ICW is that bucketing means that you match that person (and one of your proven relatives who has DNA tested) on the same segment(s) above the 9cM bucketing threshold.  You can still match on the same segments, but not be reported as a bucketed match because the segments fall below the threshold.  “In common with” means that you both match someone else, but not necessarily on the same segments.

Here’s a nice article about utilizing the 9 tools provided by Family Tree DNA for autosomal matching.

The Beauty of the Beast

The absolutely wonderful aspect of phased Family Finder Matching is that while you do need to know some third cousins or closer, and the more the better, who have DNA tested, you do NOT need access to their family information, their tree or the DNA of your matches. If your matches provide that information, that’s wonderful, but your DNA plus that of your known relatives linked to your tree is doing the heavy lifting for you.

How well does this really work? Let’s take a look and see.

On the chart below, I’ve “bucketed” my information (pardon the pun.) Keep in mind that I do have my mother’s autosomal DNA, but not my father’s. His side is represented by 8 more distant relatives, the closest of whom are my half-sister’s granddaughter and my father’s brother’s granddaughter – both of whom are the genetic equivalent of 1st cousins once removed. My mother’s side is represented by mother and two first cousins.

Tester | Total Matches | Maternal Side Bucket | Paternal Side Bucket | Both Sides Bucket | Percent Assigned
Mother | 865 | 13 | 106 | 2 | 14
Me | 1585 | 356 | 361 | 3 | 23

Mother has the above 106 paternal bucketed matches without me doing anything at all except linking the DNA tests of mother to her two first cousins in her tree.  In my case, the combination of mother’s DNA and her two first cousins generated 356 maternal side bucketed matches, just by linking mother and her two first cousins to my tree.

concepts2-tree

Mother does have one third cousin on her mother’s side who generated 13 maternal bucketed matches.  So, while third cousins are distant, they can be very useful in terms of bucketed “sides” to matches.

It’s ironic that even though I have my mother’s DNA tested, I have slightly more paternal matches, without my father, than maternal matches, with my mother. Of course, in my case, that is at least partly a result of the fact that my mother has so many fewer matches herself due to her very recent old world heritage on several lines. Don’t think though, for one minute, that you have to have parents or siblings tested for Family Finder bucketed matching to be useful. You don’t. Even second and third cousins are useful and generate bucketed maternal and paternal matches. My 361 paternal matches, all generated from 8 cousins, are testimony to that fact.

The very best thing you can do for yourself is to test the following relatives that will be used to assign your resulting matches with other people to maternal and paternal sides.

  • Your parents
  • If your parents are not both available, all of your full and half siblings
  • Your grandparents
  • Your aunts and uncles
  • Your great-aunts and great-uncles
  • All first, second and third cousins unless they are children of aunts and uncles who have already tested

The new permanent price of $79 for the Family Finder test will hopefully encourage people to test as many family members as they can find! For autosomal genetic genealogy, it’s absolutely the best gift you can give yourself – after testing yourself of course.

Concepts – Sorting Spreadsheets for Autosomal DNA

This article covers both sorting in Excel and how to identify an overlapping segment, and what that means to you as a genetic genealogist.

I swore I wasn’t going to teach Excel, but there have been so many questions about sorting Excel spreadsheets that I am going to write a very basic “how to sort and not hurt yourself” article. This does NOT replace actually understanding how to use Excel, but it will at least get you through the knothole of sorting for genetic genealogy.

I wrote more about sorting and filtering in the concepts article about assigning parental sides.

There are some advanced ways to accomplish the same thing, and I’m not discussing those. If you already know how to use Excel those are fine, but this article provides the basics for those who don’t.

Sorting

I am going to use, as an example, my matches to only a few people which gives us enough information to sort, but isn’t overwhelming.

When you download your results from Family Tree DNA, your spreadsheet will be in match name order, like the spreadsheet below.

SS Raw

I want you to notice that while the primary order is by match, there is a secondary order too (chromosome), and a third (start location) and fourth (end location) as well.

Within each match, the order is by chromosome, and then by start and end location.

What this means is that you can look at Alice and see that chromosome 1 is first, and that the lowest value start location is shown first within chromosome order.

That’s not the order you’ll likely be working with all the time, so let’s take a look at how to sort the spreadsheet in a different way.

The row highlighted in red contains the column headers.

SS column headers

When you sort by an individual column, you select the header for that column. The header shown below is the one to select if you’re going to sort the Matching SNPs column.

SS Column select

The cell on your spreadsheet won’t be red, but I’ve colored it red here so you can see that I’m selecting this column header and only this column header.

When you select a column header, you put the cursor on that cell and click once.

SS column select 2

The cell you’ve selected will be bordered in black.  A screen shot of my spreadsheet is shown above.

I want you to watch what happens to these two rows colored green when I sort in Matching SNP order.

SS rows green

At this point, you will click on the sort and filter button on the upper right hand side of the toolbar.

SS sort dropdown

Here’s a closeup.

SS sort dropdown closeup

Selecting the “Sort A to Z” option sorts the contents of the entire spreadsheet in Matching SNP order, smallest to largest, because that’s the column header and sort option combination you selected. I use lowest to highest (A-Z). You can also sort in reverse order, highest to lowest (Z-A), but that isn’t terribly useful for what we will be doing.

SS SNP column sorted

Notice that all of the rows are sorted into smallest to largest order by the Matching SNPs column. So while the two green rows were originally together, now the rows all appear in order by the Matching SNPs column values.

The first green row, the match to Alice on chromosome 3 with 1300 SNPs, falls between the SNP values of 850 and 1458.  The second green row with a value of 2000 falls between 1638 and 2355.  This is exactly as it should be.  The contents of the entire spreadsheet are sorted by the values in the Matching SNPs column.

The statement “sorts the contents of the entire spreadsheet” is very important, because if you perform this task incorrectly, you will bollux up your entire spreadsheet, as in irrecoverably and forever.  What follows is an example of what NOT TO DO.

DO NOT DO THIS

DO NOT, and I repeat, DO NOT select the entire column to sort.

SS - Do Not Sort

This is an example of WHAT NOT TO DO.

If you select the entire column, as shown above, then sort, here’s what happens.

SS example bad sort

Notice that the green rows are now split apart – in other words they no longer form a row from left to right. That means that ONLY the data in the Matching SNPs column was sorted, but not the rest of the data, which is still in the same location on the spreadsheet as it was before the sort. Therefore, Alice’s green row Matching SNPs value of 1300 is no longer with Alice, since only the data in the Matching SNPs column was sorted. Now Alice’s 1300 SNPs are connected to Stacy’s red row on chromosome 4. Alice now has 500 SNPs instead, which as you can see, clearly isn’t accurate.

This is what I meant when I said that selecting the entire column instead of just the header will forever ruin your data. If you do this, there is no recovery unless you JUST did it, realize the error, and can select the blue back arrow on the top of the toolbar on the left to “undo” your action. If you’re beyond that, the only recovery is to download your data again, or move to a backup if you have one.

SS undo

What’s even worse is if you do this and don’t realize it, so you’re working with incorrect data while trying to find overlapping segments.  Of course, everything will be wrong.  I periodically do a sanity check and look at a couple people in the chromosome browser just to make sure that everything is as it should be on my spreadsheet and I haven’t done something like this.
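For readers who are comfortable with a little scripting, the difference between the two sorts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three rows, their values, and the helper name rows_intact are made up for the example and are not taken from a real download.

```python
# Hypothetical rows: (match name, chromosome, matching SNPs).
rows = [
    ("Alice", 3, 1300),
    ("Alice", 5, 2000),
    ("Stacy", 4, 500),
]

# Correct: sort whole rows, so every value travels with its own row.
good = sorted(rows, key=lambda r: r[2])

# Wrong: sort the SNP column by itself and set it back beside the
# other columns, which never moved.
snps_only = sorted(r[2] for r in rows)
bad = [(name, chrom, snps) for (name, chrom, _), snps in zip(rows, snps_only)]


def rows_intact(original, result):
    """Sanity check: the result must contain exactly the original rows."""
    return sorted(original) == sorted(result)


print(rows_intact(rows, good))  # True
print(rows_intact(rows, bad))   # False
print(bad[0])                   # ('Alice', 3, 500)
```

In the bad version, Alice's chromosome 3 row ends up with 500 SNPs, a value that belonged to a different row. It is the same kind of scrambling shown in the screenshot above.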

To Sort Correctly – DO This

To use this spreadsheet effectively for genetic genealogy, we need the spreadsheet to be sorted in this viewing order:

  • Chromosome number
  • Start location
  • End location

In other words, we need the spreadsheet to look like this with all of the green cells remaining in their row with their match:

SS example good sort

You’ll notice that all matches on each chromosome are grouped together, with the smallest start location first, as illustrated by the red groupings of chromosomes 1 and 6. I do realize these are small segments, but the process is the same for large or small segments, so for our sorting example, just ignore any genealogical relevance associated with segment size.

You will be looking for overlapping segments. Notice that you have to be cognizant of the end location. In the case of chromosome 1, above, there are no overlapping segments for the two chromosome 1 matches, so they can’t match each other on this segment.

However, on chromosome 6, we have a different situation. Stacy’s segment match with me is quite long, 104cM. Stacy’s segment overlaps, either fully or part way, with the segment of everyone else who matches me on chromosome 6. Her segment fully covers all of Alice’s segments except for the last one. Stacy’s match to me ends at 108,000,000. Alice’s last segment match to me starts at 107,779,220, which is included in Stacy’s match, but Alice’s match extends beyond Stacy’s, to 110,175,307.

Keep in mind that we don’t know at this point whether or not Stacy and Alice are from my mother or father’s side, based on matching. In other words, to draw any conclusions, we also have to know if Stacy and Alice match each other on this segment which we can’t tell from this spreadsheet.

Because I have access to Stacy’s account, I can indeed tell you that Stacy and Alice do not match each other on this segment, so they would be from different sides of my family tree. Stacy is a known relative from my father’s side and Alice does match my mother as well, so we now know that Stacy and Alice don’t match each other.

If you don’t have access to the accounts to see if your matches match each other, two tools at Family Tree DNA are partial substitutes.

  • The ICW tool tells you if two of your matches match each other, just not on which segments.
  • The maternal/paternal Family Matching tool, if you have connected the DNA of relatives who have tested, tells you which side your matches are from, maternal or paternal.

You can read about how to use those tools here.

If there are multiple matches with the smallest start location then they will be in order by the smallest end location first, shown in the yellow cells.

Sort Order

The sort order is exactly the opposite of the viewing order. If you want to SEE the data in this order:

  • Chromosome
  • Start
  • End

Then you must sort in this order:

  • End
  • Start
  • Chromosome

The last column you sort will be the primary viewing order.
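The same reverse-order idea can be sketched in Python for anyone who prefers a script to the toolbar. This is a minimal illustration, not part of any vendor download: the rows are made up, and the column positions are assumptions. It works because Python's sort is stable, meaning rows that tie on the current column keep the order they had from the previous pass.

```python
# Hypothetical rows: (match name, chromosome, start location, end location).
rows = [
    ("Stacy", 6, 5_000_000, 108_000_000),
    ("Alice", 1, 72_017, 2_500_000),
    ("Alice", 6, 107_779_220, 110_175_307),
    ("Bob", 6, 5_000_000, 9_000_000),
    ("Bob", 1, 3_000_000, 4_000_000),
]

# Sort three times, least important column first. Because the sort is
# stable, ties on the current key keep the order from the previous pass.
by_passes = sorted(rows, key=lambda r: r[3])       # Step 1: end location
by_passes = sorted(by_passes, key=lambda r: r[2])  # Step 2: start location
by_passes = sorted(by_passes, key=lambda r: r[1])  # Step 3: chromosome

# The same result in a single pass, using all three keys at once.
one_pass = sorted(rows, key=lambda r: (r[1], r[2], r[3]))

print(by_passes == one_pass)  # True
for row in by_passes:
    print(row)
```

Bob and Stacy share a start location on chromosome 6 in this made-up data, and Bob is listed first because his end location is smaller.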

Let’s look at our spreadsheet utilizing these three steps, in order.

Step 1 – First Sort

Selecting End Location to sort:

SS sort end location

After sorting by end location, below.

SS end location sorted

You will notice that all of the data is now in order by the values in the End Location column – smallest at the top, largest at the bottom.

The data in the other columns is not in any particular order at all.

Step 2 – Second Sort

Now selecting Start Location to sort that column in order, shown below.

SS sort by start location

Having sorted by Start Location, below:

SS sorted by start location

You will notice that now all of the data is sorted by start location. In the case where there is a common start location between two rows, highlighted in red, the row with the lower end location will show first, noted in yellow, because you sorted first by end location in smallest to largest order.

Step 3 – Third Sort

Last, you’ll select the Chromosome column header to sort in chromosome order.

Sort by chromosome

Below, the result of sorting the third time in chromosome order.  After sorting, I bordered all segments on the same chromosome.

Sorted by chromosome

You can see that the entire spreadsheet is grouped by chromosome, and within chromosome number, the Start Location is grouped smallest to largest. If there are multiple people with the same start location, then the End Location comes into play, with the smallest end location listed first, as shown in the red and yellow rows.

If you want to sort your spreadsheet in another order for some reason, you can do so using the same methodology. Once you understand about sorting spreadsheets, you understand about sorting all spreadsheets.

Now, you’re ready to look for your overlapping segments.

What is an Overlap?

An overlap is two segments of your matches that are partially or completely overlapping each other.  When you have overlapping segments, assuming they are of decent size, that indicates that the two people who match you on your spreadsheet potentially match each other too.  Remember, there are three matching possibilities:

  • Your matches will either match each other, in addition to you, because you and both of them share a common ancestor or…
  • They both match you, but they won’t match each other because one is from your mother’s side and one is from your father’s side or…
  • One or both are identical by chance.  If you need a refresher on what identical by chance, descent and population mean, click here.

Ss no overlap

In this first example, above, there is no overlap between these two people on chromosome 17.  One begins at 31,000,000 and ends at 36,000,000 while the second person’s match with you doesn’t begin until 40,000,000, which is clearly beyond the end of 36,000,000, so there is no possibility of overlaps between these two individuals.  In other words, they cannot match each other on these segments.  However, clearly they both match you because they are both on your matching spreadsheet.

SS overlap 1

In the example above, the overlapping portion of the segment is from 38,000,000 – 40,000,000.  The second person’s match with you extends to 53,000,000, but the area between 40,000,000 and 53,000,000 does not overlap.

SS overlap 2

In the example above, the start number is lower for the top row than the second row, so the overlapping area is still from 38,000,000 – 40,000,000, because the matches don’t match from 36,000,000 to 38,000,000.

SS overlap 3

Occasionally, you have an overlap that is fairly minuscule, which I generally ignore unless the two matches are in a group with a larger match that overlaps or covers both smaller matches, as in the example above. You can see that our red and yellow rows have a very small overlap from 39,500,000 – 40,000,000. However, the top row includes the entire areas of both red and yellow rows, reaching from 33,000,000 to 55,000,000 which begins before either red/yellow row and ends after both red/yellow rows.  So either all 3 individuals will match each other, indicating a common ancestor, or the top row will match one of the red/yellow rows and not the other.
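The overlap arithmetic in these examples boils down to "largest start, smallest end." Here is a minimal Python sketch of that rule. The function name overlap is hypothetical, and where the examples above don't state an end location (the 45,000,000 below) the value is an assumption chosen only to make the example complete.

```python
def overlap(seg_a, seg_b):
    """Return the (start, end) shared by two segments on the same
    chromosome, or None if they do not overlap.

    Each segment is a (start location, end location) pair.
    """
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if start < end else None


# No overlap: one match ends at 36,000,000 and the other does not
# begin until 40,000,000 (its end location is assumed).
print(overlap((31_000_000, 36_000_000), (40_000_000, 45_000_000)))
# None

# Partial overlap: only 38,000,000 to 40,000,000 is shared.
print(overlap((36_000_000, 40_000_000), (38_000_000, 53_000_000)))
# (38000000, 40000000)
```

An overlap only says the two people could match each other on that stretch. Whether they actually do still has to be checked, as described above.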

Combining Spreadsheets From Different Sources

The good news is that you can download your matches into a spreadsheet format from 23andMe, Family Tree DNA and GedMatch, but you do need to understand something about the basics of sorting and how to stay out of spreadsheet trouble. I am careful about combining spreadsheet sources for a couple of reasons.

  • First, the formatting is not exactly the same, so you may need to move columns to be in the correct order for your spreadsheet before actually combining them.
  • Second, there may be overlapping people between 23andMe, Family Tree DNA and GedMatch. You’ll need to figure out how you want to deal with that, especially on an ongoing basis when you need to add to or update your spreadsheet without overwriting or eliminating your matching work and notes relative to common ancestors and ancestral lines in the columns you’ll be adding.

I always make a backup file with the date in the file name before doing combinations, and sometimes before sorting as well.
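If you would rather not rename copies by hand, a dated backup can be scripted. The sketch below is one possible way to do it in Python. The function name backup_with_date and the file name in the comment are hypothetical, and the date format is simply a choice that sorts well in a folder listing.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_with_date(path):
    """Copy a file next to itself with today's date added to the name.

    For example, matches.xlsx becomes matches_2024-01-31.xlsx
    (using whatever today's date is).
    """
    source = Path(path)
    stamp = date.today().strftime("%Y-%m-%d")
    target = source.with_name(f"{source.stem}_{stamp}{source.suffix}")
    shutil.copy2(source, target)  # copies the contents and file timestamps
    return target


# Usage, assuming a spreadsheet with this (hypothetical) name exists:
# backup_with_date("my_matches.xlsx")
```

Note that running it twice on the same day overwrites the earlier copy from that day.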

Learning Excel

If you want to learn more about how to use Excel, here are some additional resources to utilize.

I found some training videos for Excel including “Twenty with Tessa, Tips and Suggestions for Spreadsheets” which is focused on using spreadsheets with one name studies and genetic genealogy, but the principles are the same.  https://www.youtube.com/watch?v=Ll_cfhOZTl0&feature=youtu.be

When discussing this online, one person mentioned that she joined www.lynda.com and took the basic Excel class, which she found very useful.

Kitty Cooper has instructions on her blog for how to make a matches spreadsheet as well.

www.DNAadoption.com has some good courses.  Their DNA for beginners covers using spreadsheets and is not just for adoptees!

Concepts – Match Groups and Triangulation

Today, we’re going to talk about the concepts of autosomal DNA and the differences between:

  • Match groups
  • Mathematical triangulation
  • Genealogical triangulation

Match Groups

At Family Tree DNA, when you download your chromosome matching results, meaning your complete spreadsheet, and then sort it into chromosome order (sorting by column: end location, then start location, then chromosome), your spreadsheet will look like this.

MG1

Of course, your spreadsheet will be a lot longer and will continue with additional matches on chromosome 1, then chromosome 2, etc.

In the example above, we see that this is just one match group, meaning that the segments for all individuals overlap, which indicates that they match me. In fact, I copied the first match group on my spreadsheet to use in this example.

A match group is a group of people that match YOU on the same segment of your chromosome.  You will have many match groups.

Each of these people matches me on chromosome 1 beginning at location 72,017 for some distance. The shortest match is Calvin and he matches on only 3.47 cM, which tells me that I have other matches with Calvin, because this segment is too short to have made it over the match threshold by itself. So I know this is just one of at least two segment matches to Calvin.

I don’t know Calvin, so I don’t know which side of my tree Calvin falls on or how we are related.

The next several matches’ segments are about the same size, 11 or 12 cM, with the final segment being significantly larger, 26.82 cM. If you need a refresher, I wrote a concepts article about centiMorgans and SNPs.

There are two really important things to remember about match groups.

  1. A match group means that the people who are on this list match YOU. It does NOT mean that these people match each other. In fact, if you recall, you have two sides to each chromosome, one from Dad and one from Mom, so it’s very likely that you have matches from Mom and matches from Dad, intermixed, in this and every match group. Without additional information, you have no way to discern who matches you on which side.
  2. Do not be deceived by thinking that the beginning or ending location, or both, is indicative of matching on one side or the other, or that the people who share beginning and/or ending locations match each other too. If I were to fall into this trap, I would PRESUME (and that’s a very dangerous word in DNA matching) that Cheryl through Rutha, inclusive, match each other and are from the same parental side – and I will tell you right now – they aren’t. So just don’t go there because it will trip you up sure as shootin’.

So, the people in a match group match YOU. Don’t read anything more into these matches at this point.
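Once a spreadsheet is sorted, pulling out match groups is a mechanical walk down the rows. The Python sketch below is one rough way to do that, under a loudly stated assumption: a row joins the current group when it starts before the furthest end location seen so far in that group. That rule, the function name match_groups, the end locations, and the name Zed are all made up for illustration.

```python
def match_groups(segments):
    """Group segments that overlap on the same chromosome.

    Each segment is (name, chromosome, start, end). A segment joins the
    current group when it starts before the group's furthest end so far.
    """
    groups = []
    current, current_chrom, current_end = [], None, None
    for seg in sorted(segments, key=lambda s: (s[1], s[2], s[3])):
        name, chrom, start, end = seg
        if current and chrom == current_chrom and start < current_end:
            current.append(seg)
            current_end = max(current_end, end)
        else:
            if current:
                groups.append(current)
            current, current_chrom, current_end = [seg], chrom, end
    if current:
        groups.append(current)
    return groups


# Hypothetical data: end locations are invented for the example.
segments = [
    ("Cheryl", 1, 72_017, 3_000_000),
    ("Calvin", 1, 72_017, 1_500_000),
    ("Rutha", 1, 72_017, 8_000_000),
    ("Zed", 1, 20_000_000, 25_000_000),
    ("Zed", 2, 100_000, 5_000_000),
]
for group in match_groups(segments):
    print([seg[0] for seg in group])
```

The grouping says nothing about sides. Everyone in a group matches you, and that is all it tells you.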

However, let’s move on to mathematical triangulation because there’s more that we can discover.

Mathematical Triangulation

I’ve gone and snuck a new term in on you haven’t I – mathematical triangulation. Sorry folks.

We have talked about autosomal triangulation several times before. You can read about triangulation here.

In a nutshell, triangulation requires three things of all people in a triangulation group:

  • They all match you on the same segment (math), which you know because they are all on your match list
  • They all match each other on the same segment (math), which you may or may not be able to discern by utilizing various tools
  • They have a common ancestor or ancestral line (genealogy), which you may or may not be able to discern through traditional genealogy

Triangulation is two parts math and one part genealogy. Don’t let that math word frighten you, because the math isn’t the hard part as it can be done by sorting a spreadsheet or by vendor tools, but the genealogy has to be done by you.

Mathematical triangulation divides your matches into two groups, one from your mother’s side and one from your father’s side.

Let’s step through this process and see how it works.

On any group of people who match you on a particular segment of a chromosome, when you have enough people, your matches will form two groups, plus possibly an outlier or two.

Why?

Because you have matches from both your father’s side and your mother’s side, assuming enough people have tested.

Let’s look at an example.

If you have 3 matches, it’s possible that all 3 matches are from your mother’s side.  However, as more people test and match you, eventually, you will have two groups of people form, one from each parent’s side.

How do you know who is in each group?

Good question – and this is what defines a triangulation group versus a match group.

In a triangulation group, all of the people in the group MUST ALSO MATCH EACH OTHER on the same segment.  Yes, I’m shouting because if you forget this, you’re toast!

How do you figure out if they match each other?

In my case, I have access to the kits of the people colored peach below, because I paid for their testing, or put another way, they tested to do me a favor.

MG2

The kit in blue is managed by a cousin with whom I have a many-years long relationship, so while I don’t have direct access to this kit, I do have a great working relationship with the person involved.

So, I can sign in and I can see who matches whom. If you don’t have access to any of the kits, you can look at the ICW (in common with) list, which tells you which of these people also match each other, but the ICW list does not tell you if they match on the same segments, just that they match.  It’s not triangulation, but it is information and if the entire group matches each other utilizing the ICW tool, that’s fairly indicative that you do indeed have a mathematical triangulation group – but it’s not 100%.  You can read about the ICW tool and how to use it in this Nine Autosomal Tools at Family Tree DNA article.

So, let’s see what the two groups look like after I see who matches whom by looking at each person’s matches on this segment of chromosome 1.

MG3

I was able to check each of Cheryl, Don and Rex’s kits and they all match each other along with Jim, so we know that all 4 of these people in the green group match each other plus me.

I was able to check Lazarus’s kit directly and Amos’s kit through my cousin, and I was also able to verify that they match each other and they both also match Rutha, so I know that this purple group all matches each other plus me.

So what happened to Calvin – the uncolored row in the middle?

First, I can’t check Calvin’s kit directly, but the fact that his segments do mathematically overlap BUT he doesn’t match all of the people in either group, in fact, he doesn’t match any of the people in either group whose kits I could check, tells me that his results are zigzagging back and forth between my mother’s and my father’s DNA, from side to side.  I also verified his non-matching status to my matches utilizing the ICW tool.

This is called an identical by chance match. In essence, Calvin doesn’t match either side so it’s what is known as a false positive match – looks to be real but isn’t upon further inspection.  Remember I said that your matches would form two groups that match each other – plus a few outliers?  Calvin is an outlier.
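The sorting-into-groups logic described above can be sketched in Python. This is a simplified illustration, not a vendor tool: the function name triangulation_groups is hypothetical, the list of who matches whom is typed in by hand from the kit checks described above, and the greedy approach (put each person in the first group where they match everyone) is an assumption that works for clean cases like this one.

```python
from itertools import combinations

# Who matches whom on this segment, per the kit checks described above.
green = ["Cheryl", "Don", "Rex", "Jim"]
purple = ["Lazarus", "Amos", "Rutha"]
pairs = set()
for members in (green, purple):
    for a, b in combinations(members, 2):
        pairs.add(frozenset((a, b)))


def triangulation_groups(people, pairs):
    """Place each person in the first group where they match everyone."""
    groups = []
    for person in people:
        for group in groups:
            if all(frozenset((person, other)) in pairs for other in group):
                group.append(person)
                break
        else:
            # No existing group fits, so this person starts a new one.
            groups.append([person])
    return groups


people = ["Cheryl", "Don", "Rex", "Jim", "Calvin", "Lazarus", "Amos", "Rutha"]
groups = triangulation_groups(people, pairs)
triangulated = [g for g in groups if len(g) > 1]
outliers = [g[0] for g in groups if len(g) == 1]

print(triangulated)  # [['Cheryl', 'Don', 'Rex', 'Jim'], ['Lazarus', 'Amos', 'Rutha']]
print(outliers)      # ['Calvin']
```

The code only does the math portion. It cannot say which group is Mom's and which is Dad's.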

Now, we have all of our matches sorted into two groups, a mother’s group and a father’s group, plus one who doesn’t match either group. The question is which ancestors do these matches come from and which side is mother’s and which side is father’s.

Now it’s time to add the genealogy portion as the third piece of the triangulation pie.  Sometimes, when you don’t have the ability to do mathematical triangulation, per se, by comparing individual kits, you can achieve mathematical triangulation by utilizing genealogical triangulation – so these two actually go hand in hand.

While genealogical triangulation can achieve mathematical triangulation, by organizing people into matching sides, mathematical triangulation cannot replace genealogical triangulation.

But wait, there is actually a shortcut I can take, and it is a way to begin adding genealogy immediately and easily.

Genealogical Triangulation

There are multiple ways to perform the final step in triangulation which takes your matches from mathematically matched groups to genealogically triangulated groups with ancestors or ancestral lines assigned. In fact, sometimes the genealogy will actually be what helps you with the mathematical grouping if you don’t have access to kits to check who your matches match.

Genealogy Triangulation Method 1 – Adding a Parent or Parents

My mother has tested too. In my spreadsheet, I have added her matches into my master spreadsheet so I can easily see who matches both me and my mother. A match to both of us tells me immediately which side the match is on.

MG4

My mother’s matches are colored pink. You can see immediately that Mom and I both match Cheryl, Don, Rex and Jim, so those matches are assigned to my mother’s side.
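In spreadsheet terms this is a color comparison, but underneath it is a simple set intersection. Here is a minimal Python sketch of the idea. It assumes both lists have already been narrowed to people who match on this same segment, and the variable names are made up for the example.

```python
# People matching me, and people matching my mother, on this same segment.
my_matches = {"Cheryl", "Don", "Rex", "Jim", "Calvin", "Lazarus", "Amos", "Rutha"}
mother_matches = {"Cheryl", "Don", "Rex", "Jim"}

# Anyone on both lists is assigned to my mother's side.
maternal = my_matches & mother_matches

# Everyone else is NOT automatically paternal. Some, like Calvin,
# may be identical by chance and belong to neither side.
not_maternal = my_matches - mother_matches

print(sorted(maternal))      # ['Cheryl', 'Don', 'Jim', 'Rex']
print(sorted(not_maternal))  # ['Amos', 'Calvin', 'Lazarus', 'Rutha']
```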

Genealogical Triangulation Method 2 – Using Known Individuals and Identifying Common Ancestors or Ancestral Lines

As it turns out, I already know who Cheryl, Don and Rex are, so I know that they match on my mother’s side, even without my mother’s DNA test. Cheryl and Don are siblings who are my mother’s first cousins, and Rex is a second cousin. All of these people match both me and Mom on the same segments, which means that these matches come from my mother’s side.  It also means that I received this entire segment intact from my mother, without being divided.  Utilizing close relatives to sort matches into groups is exactly why we encourage everyone to test as many known relatives as you can convince to test, except children of relatives who have already tested, because their children only received a part of the parents’ DNA.

Furthermore, it also means that Jim, whose genealogy I don’t know, is from the same line because he matches Cheryl, Don, Rex and mother as well.

That does not mean that Jim necessarily shares the same most recent common ancestors (MRCA) as Cheryl, Don, Rex and mother.

Even if I didn’t have my mother’s information, knowing her relationship to Cheryl, Don and Rex along with mathematical triangulation is enough to assign these relatives and people they all match to my mother’s side.  Even if I don’t have access to their accounts, I still know that they are closely related to my mother, there are multiple of them, and they all match me on the same segment, so I can group these people together, if nothing more.

Margaret Lentz chart

The common ancestors of Barbara (mother), Don and Cheryl are Evaline Miller and Hiram Ferverda.

The common ancestors of Barbara, Don, Cheryl and Rex are Margaret Lentz and John David Miller.

So while these individuals do share a common ancestor, and we can identify who it is, the most recent common ancestor is different between Rex and Cheryl, Don, and Barbara.

We don’t know who Jim is, but I can pretty much tell you that he isn’t a descendant of Evaline Miller and Hiram Ferverda, unless he is through an unknown child. I can be pretty certain that he’s not a direct descendant of Margaret Lentz and John David Miller either.

I can also tell you with equal conviction that he IS descended from either the Margaret Lentz or John David Miller line, because he matches all 4 cousins who descend from that couple – Cheryl, Don, Barbara and Rex – on the same reasonably large segment. So while we can’t identify his common ancestor with the group, at least not yet, we can say with certainty that he descends from either these common ancestors or an ancestor to these ancestors.

Now, if Jim also matched to William Lentz, above, which he doesn’t, but let’s say he did – we would then know which side of the Margaret Lentz and John David Miller line Jim represented. The Lentz line, of course.

This would also tell us that if Jim matched the Lentz line, that the DNA he shares with Barbara, Rex, Don and Cheryl was from Margaret Lentz, so descended from her parents, Jacob Lentz and Fredericka Reuhle. Of course, we don’t know that today – but all it takes is the right “new match” whose genealogy is proven and we can then attribute the individual segments to specific ancestors.

Let’s add mother’s lines into the mathematically matched chart.

MG5

As you can see, just as expected, Mom matches all of the same people that match each other, along with me, in the green match group, which is now a triangulation group because we know which side the match is on – Mom’s.

To add more definition to the triangulation group, we need genealogical information about the people in the group so that we know who their common ancestors are, or their common line. Fortunately, we have that.

You can also see that Mom matches Lisa as well, but neither Cheryl, Don, nor Rex match Lisa, so Lisa must be from Mom’s mother’s side AND I didn’t inherit that DNA from Mom because I don’t match Lisa either. That’s good information to know through deductive reasoning. It’s also possible of course that Lisa’s match to Mom is IBC, identical by chance, but at almost 14 cM, that’s rather unlikely.

I’ve updated the “side” column with what we’ve learned.

Lastly, given that I do know the genealogy of many of these people, I’ve added that information into additional columns on my spreadsheet, along with the fact that these segments do in fact triangulate. Please note that you can click to see a larger image.

MG6

Now, if you’ve just caught the words “these segments do in fact triangulate” and you’re about to ask if each matching segment needs to be triangulated individually – the answer would be yes. You may share multiple ancestors with someone, on both sides of your family. In fact, even worse you can share multiple ancestors on each side of your family. Endogamy will do that to you.

We’ll pause a minute here for the groaning to subside.

One last comment is that when I infer a side, like with Lisa who does not match on Mom’s father’s side on this segment, I don’t assign the side, I just make a note because we really CAN’T say that Lisa matches on my Mom’s mother’s side, because she might be a false positive match.

Also, in the case of Rutha, we know she descends from either one of these common ancestors or an ancestor of an ancestor, so I simply note the group she triangulates with for further reference.  That information about Rutha will come in useful when I work with other match groups that she is a member of – trying to make them into triangulation groups as well.

Genealogical Triangulation Method 3 – Phased Family Matches

You can also check for your phased Family Matches on your match page at Family Tree DNA to see if any of those individuals who match you on your spreadsheet are already assigned to the maternal or paternal sides based on phased matches with qualifying relatives. You can see on the page below that indeed, on my mother’s kit, Cheryl is assigned to her paternal bucket, being her third closest match.

MG7

However, you CAN’T assume that because a match doesn’t have a maternal or paternal icon assigned that they aren’t descended from a particular side of your family.

Family Tree DNA only assigns high confidence phased matches so that you can depend on those results.

Remember, each segment needs to be individually triangulated, and the Family Matching algorithm that assigns maternal and paternal icons has a higher threshold and other internal requirements that may cause a parental icon NOT to be displayed when the match IS from that side of the family.  This is not a bug but a design element that assures that only highly confident matches are parentally assigned.

So you can use the parental icon as a great tool to assign a genealogical maternal or paternal side, but you still need to do due diligence in terms of working each segment to identify with whom it triangulates and the common ancestors.

Genealogical Triangulation Method 4 – Don’t Forget the X

While utilizing this trick won’t get you all the way to triangulation, it will help in several cases, at least by assigning parental “sides” to some matches – for males only.  Yes, ladies, I know, it’s not fair.

Because males inherit an X chromosome only from their mother, and a Y from their father, any match that is labeled as an X match:

  • Has to have come from a male’s maternal line if the segment is a valid match.
  • Has to have a segment on chromosomes 1-22 that is over the matching threshold for an X match to be reported.

Aside from that, the X is subject to the same segment size considerations of all other chromosomes and segments.

Any X match of a reasonable size, meaning one that is less likely to be identical by chance, had to have come from a male’s mother’s line – so a match to a male with a reasonably large X segment is an indication of a maternal line match for him, at least on that segment.  For two matching men, an X match has to be a maternal line match for both men.  However, keep in mind that I have seen X segments that match on completely different lines than autosomal matches.

When comparing the X chromosome, in non-endogamous populations, I would certainly note matches over 3cM. I would not assign a maternal “side” unless the match was 7cM or over and 500 SNPs or more.  Lastly, in endogamous populations, I would be even more restrictive in terms of assigning the segment as “real,” but I would make notes because it would help focus where I look.
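Those rules of thumb for non-endogamous populations can be written down as a tiny decision function. The Python sketch below is only an illustration. The function name x_match_action and the wording of the three labels are hypothetical, and the thresholds are treated as minimums, so a "candidate" result still calls for judgment rather than an automatic assignment.

```python
def x_match_action(cm, snps, tester_is_male=True):
    """Suggest what to do with one X segment (non-endogamous case).

    Thresholds follow the text: note anything over 3 cM, and only
    consider a maternal side for a male at 7 cM or over with 500 SNPs
    or more. The label wording is an assumption for this sketch.
    """
    if tester_is_male and cm >= 7 and snps >= 500:
        return "candidate for maternal side"
    if cm > 3:
        return "note"
    return "below note threshold"


print(x_match_action(7.5, 650))   # candidate for maternal side
print(x_match_action(5.0, 800))   # note
print(x_match_action(7.5, 400))   # note
print(x_match_action(2.0, 900))   # below note threshold
```

For a woman the function never suggests a side, because her X comes from both parents.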

MG8

As you can see, this gentleman only has 4 X matches, and two of those are quite small. One is near 5cM and one is near 7cM, but none of the 4 is compellingly large – meaning they could be identical by chance. I would make a note for the two larger matches by the names of the people he matches, but I would not assign any of the matches as maternal at this point given the small segments.  When I make “side” assignments, I want them to be as strong as possible so I don’t have to second guess the assignment later.

I would also look at who else I match on the X on that segment and see if there is anything remarkable about common matches.  In males, if the X match is valid, it HAS to come from mother’s side, so if you see a small segment X match in someone you know is related on your father’s side, that’s an indication that the X match on that segment to that person is either IBC (identical by chance) or that you also share a second ancestor on the maternal side.

Larger X matches are already mathematically triangulated for you, meaning, if you’re a male,  you know what side they come from.  There are no “2 sides” to this chromosome and all you need to add is genealogy.  Women, you still have two sides, because you inherited an X from both your mother and your father, so you are not mathematically triangulated.  Sorry ladies!

If you would like to read more about the X chromosome, inheritance patterns and matching, click here and here. Be sure to utilize the inheritance pattern genealogy charts.

Summary

See how easy and fun this is when you break it down into easy, logical steps.

Even if you can’t mathematically triangulate, if you can find multiple people in the match group that descend from the same known ancestors or ancestral line, you can form smaller groups with these individuals until you have the opportunity to create a larger mathematical match group.

It only takes three people to create a genealogical triangulation group, so long as they aren’t close relatives.  Siblings don’t count, for example, and neither do parents when counting to 3 in a triangulation group.
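If you track your match groups electronically, the "counting to 3" rule can be sketched like this. It is a rough illustration only. The function names and the relationship labels are invented for the example, and it assumes the list holds everyone you want to count and that each person is labeled by their relationship to the tester.

```python
# Relationships that do not count toward the three (from the text above).
# Extend this set if you treat other close relatives the same way.
CLOSE_RELATIONSHIPS = {"parent", "sibling"}


def eligible_members(group):
    """Return the names in `group` that count toward a triangulation group.

    `group` is a list of (name, relationship_to_tester) tuples.
    """
    return [name for name, rel in group if rel not in CLOSE_RELATIONSHIPS]


def has_enough_members(group, minimum=3):
    """True if at least `minimum` people in `group` are not close relatives."""
    return len(eligible_members(group)) >= minimum


group = [
    ("Me", "self"),
    ("Mom", "parent"),
    ("Cousin A", "second cousin"),
    ("Cousin B", "third cousin"),
]
```

Here `has_enough_members(group)` is true because three people remain after Mom is set aside. The sketch only checks each person's relationship to the tester, so two matches who are siblings to each other would still need to be caught by eye.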

As long as you can identify who your DNA on a particular segment came from, you really don’t need to fit everyone on your match list into a triangulation group – so if you don’t have access to some accounts, it’s not the end of the world.  You can also use the ICW tool to determine who people match in general, even if not on a specific segment.

You’re only going to have two ancestral lines, a paternal line and a maternal line, that are relevant to any match group – so once you know who those ancestors are, the rest of your matches HAVE to fall into one group or the other, or are outliers.  So don’t obsess about not being able to fit everyone into a match group either mathematically or genealogically.  However, if you can find a genealogical connection, by all means do, because one of those matches may well be the person who isolates the DNA to either the male or female of the ancestral couple and allows you to go back another generation or two in time.

Don’t forget about utilizing your Family Matches with assigned paternal and maternal icons.  That’s a great new tool.

It’s fascinating to see which of our ancestors our DNA came from. Finding new cousins by utilizing DNA is exciting as well – and gives us new opportunities to establish family relationships and share research information – opportunities that never existed prior to DNA bringing us together and providing us with that all important introduction.

Have fun.