AutoKinship at GEDmatch by Genetic Affairs

Genetic Affairs has created a new version of AutoKinship at GEDmatch. The new AutoKinship report adds new features, allows for more kits to be included in the analysis, and integrates multiple reports together:

  • AutoCluster – the autoclusters we all know and love
  • AutoSegment – clusters based on segments
  • AutoTree – reconstructed tree based on GEDCOM files of you and your matches, even if you don’t have a tree
  • AutoKinship – the original AutoKinship report provided genetic trees. The new AutoKinship report includes AutoTree, combines both, and adds features called AutoKinship Tree. (Trust me on this one – you’ll see in a minute!)
  • Matches
    • Common Ancestors with your ancestors
    • Common Ancestors between matches, even if they don’t match your tree
    • Common Locations

Maybe the best news is that some reports provide automatic triangulation because, at GEDmatch, it’s possible to not only see how you match multiple people, but also if those people match each other on that same segment. Of course, triangulation requires three-way matching in addition to the identification of common ancestors which is part of what AutoKinship provides, in multiple ways.

Let’s step through the included reports and features one at a time, using my clusters as an example.

Order Your Report

As a Tier 1 GEDmatch customer, sign in, select AutoKinship and order your report.

Note that there are now two clustering settings: the default setting and one that produces denser clusters. The latter is the default for AutoKinship, since it has been shown to produce better AutoKinship results.

You can also select the number of kits to consider. Since this tool is free with a GEDmatch Tier 1 subscription, you can start small and rerun if you wish, as often as you wish.

Currently, a maximum of 500 matches can be included, but that will be increased to 1000 in the future. Your top 500 matches that fall within the specified cM matching parameters will be included.

I’m leaving this at the maximum 400 cM threshold, so every match below that is included. I generally leave this default threshold because otherwise my closest matches will be in a huge number of clusters which may cause processing issues.
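For readers who like to see the logic spelled out, here is a minimal Python sketch of the selection rule described above: keep matches under the upper cM threshold, then take the strongest ones up to the kit limit. The function name, the tuple layout and the sample numbers are illustrative assumptions, not GEDmatch or Genetic Affairs code.

```python
def select_matches(matches, max_cm=400.0, limit=500):
    """Return up to `limit` matches sharing less than `max_cm`, strongest first.

    `matches` is a list of (name, total_shared_cm) tuples (hypothetical layout).
    """
    # Drop close relatives above the upper threshold (assumed to be exclusive).
    eligible = [(name, cm) for name, cm in matches if cm < max_cm]
    # Strongest matches first, so the cap keeps the top of the list.
    eligible.sort(key=lambda item: item[1], reverse=True)
    return eligible[:limit]


# Made-up example values, for illustration only.
sample = [("Mom", 3400.0), ("Bill", 210.5), ("David", 95.2), ("Robin", 38.0)]
selected = select_matches(sample, max_cm=400.0, limit=2)
print(selected)
```

Raising max_cm lets close family back in, which is the situation covered in the Special Use Cases section.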

For a special use case where you will want to increase the cM threshold, see the Special Use Cases section near the end of this article.

You can select a low number of matches, like 25 or 50 which is particularly useful if you want to examine the closest matches of a kit without a tree.

Keep in mind that there is currently a maximum processing time of 10 minutes allowed per report. This means that if you have large clusters, which are the last ones processed, you may not have AutoKinship results for those clusters.

This also means that if you select a high cM threshold and include all 500 allowable matches, you will receive the report but the AutoKinship results may not be complete.

When finished, your report will be delivered to you as a download link with an attached zipped file which you will need to save someplace where you can find it.

Unzip

If you’re a PC user, you’ll need to unzip or extract the files before you can use the files. You’ll see the zipper on the file.

If you don’t extract the contents, you can click on the file to open which will display a list of the files, so it looks like the files are extracted, but they aren’t.

You can see that the file is still zipped.

You can click on the html file which will display the AutoCluster correctly too, but when you click on any other link within that file, you’ll receive this error message if the file is still zipped.

If this happens to you, it means the file is still zipped. Close the files you have open, right click on the yellow zipped file folder and “extract all.”

Then click on the HTML link again and everything should work.

Ok, on to the fun part – the tools.

Tools

I’ve written about most of these tools individually before, except for the new combinations of course. I’ve put all of the Genetic Affairs Tools, Instructions and Resources in one article that you can find here.

I recommend that you take a look to be sure you’re using each tool to its greatest advantage.

AutoCluster

Click on the html file and watch your AutoCluster fly into place. I always, always love this part.

The first thing I noticed about my AutoCluster at GEDmatch is that it’s HUGE! I have a total of 144 clusters and that’s just amazing!

Information about the cluster file, including the number of matches, maximum and minimum cM used for the report, and minimum cluster size appears beneath your cluster chart.

22 people met the criteria but didn’t have other matches that did, so they are listed for my review, but not included in the cluster chart.

At first glance, the clusters look small, but don’t despair, they really aren’t.

My clusters only look small because the tool was VERY successful, and I have many matches in my clusters. The chart has to be scaled to be able to display on a computer monitor.

New Layout

Genetic Affairs has introduced a new layout for the various included tools.

Each section opens to provide a brief description of the tool and what is occurring. This new tool includes four previous tools plus a new one, AutoKinship Tree, as follows:

AutoCluster

AutoCluster first organizes your DNA matches into shared match clusters that likely represent branches of your family. Everyone in a cluster will likely be on the same ancestral line, although the MRCA between any of the matches and between you and any match may vary. The generational level of the clusters may vary as well. One may be your paternal grandmother’s branch, another may be your paternal grandfather’s father’s branch.
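To make the idea of shared match clusters concrete, here is a rough Python sketch. The actual clustering method is not spelled out in this description, so the sketch substitutes the simplest possible stand-in: treat each "these two matches also match each other" pair as a link and collect the connected groups. The function name and the sample pairs are assumptions for illustration.

```python
from collections import defaultdict


def shared_match_clusters(shared_pairs):
    """Group matches into clusters using connected components (a stand-in method)."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Walk outward from this match to everyone reachable through shared matches.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical shared-match pairs.
pairs = [("Bill", "David"), ("David", "C"), ("Karen", "Lee")]
clusters = shared_match_clusters(pairs)
print(clusters)
```

A real clustering tool is more selective than this, which is why a person can match a few cluster members and still not be placed in the cluster.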

AutoSegment

AutoSegment organizes your matches based on triangulating segments. AutoSegment employs the positional information of segments (chromosome and start and stop position) to identify overlapping segments in order to link DNA matches. In addition, triangulated data is used to corroborate these links. Using the user-defined minimum overlap of a DNA segment, we perform a clustering of overlapping DNA segments to identify segment clusters. The overlap is calculated in centimorgans using human genetic recombination maps. Another aspect of overlapping segments is the fact that some regions of our genome seem to have more matches as compared to the other regions. These so-called pile-up areas can influence the clustering. The removal of known pile-up regions based on the paper of Li et al 2014 is optional and is not performed for this analysis. However, a pileup report is provided that allows you to examine your genome for pileup regions.
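The overlap test at the heart of that description can be sketched in a few lines of Python. This sketch assumes the start and stop positions have already been converted to centimorgans with a recombination map; the Segment type, the function names and every number below are made up for illustration and are not taken from a real report.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    match: str         # name of the person who matches the tester
    chromosome: int
    start_cm: float    # assumed to be already converted to cM
    end_cm: float


def overlap_cm(a, b):
    """Length in cM shared by two segments, or 0.0 if they do not overlap."""
    if a.chromosome != b.chromosome:
        return 0.0
    return max(0.0, min(a.end_cm, b.end_cm) - max(a.start_cm, b.start_cm))


def linked(a, b, min_overlap_cm):
    """True when two segments overlap by at least the user-defined minimum."""
    return overlap_cm(a, b) >= min_overlap_cm


# Invented positions, for illustration only.
bill = Segment("Bill", 3, 40.0, 62.5)
david = Segment("David", 3, 42.0, 61.0)
robin = Segment("Robin", 7, 42.0, 61.0)
print(overlap_cm(bill, david), linked(bill, david, 10.0))
```

Overlap alone only links two matches to the tester on the same stretch of a chromosome. Whether the two matches also match each other there is a separate check.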

AutoTree

By comparing the tree of the tested person and the trees from the members of a certain cluster, we can identify ancestors that are common amongst those trees. First, we collect the surnames that are present in the trees and create a network using the similarity between surnames. Next, we perform a clustering on this network to identify clusters of similar surnames. A similar clustering is performed based on a network using the first names of members of each surname cluster. Our last clustering uses the birth and death years of members of a cluster to find similar persons. As a consequence, initially large clusters (based on the surnames) are divided up into smaller clusters using the first name and birth/death year clustering.
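The first step, grouping similar surnames, might look something like the Python sketch below. It uses the standard library's difflib.SequenceMatcher as the similarity measure and a simple one-pass grouping. Both choices, along with the 0.8 threshold and the variant spellings "Lenz" and "Dotson", are assumptions for illustration; the description above does not say which measure or clustering method is actually used.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """True when two surnames are close enough under SequenceMatcher's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_surnames(surnames, threshold=0.8):
    """Greedy grouping: add each name to the first cluster holding a similar name."""
    clusters = []
    for name in surnames:
        for cluster in clusters:
            if any(similar(name, member, threshold) for member in cluster):
                cluster.append(name)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([name])
    return clusters


groups = cluster_surnames(["Lentz", "Lenz", "Dodson", "Dotson", "Barron"])
print(groups)
```

The same kind of grouping could then be repeated on first names and on birth and death years inside each surname cluster, which is how the large initial clusters get divided up.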

AutoKinship

AutoKinship automatically predicts family trees based on the amount of DNA your DNA matches share with you and each other. Note that AutoKinship does not require any known genealogical trees from your DNA matches. Instead, AutoKinship looks at the predicted relationships between your DNA matches, and calculates many different paths you could all be related to each other. The probabilities used by this AutoKinship analysis are based on simulated data for GEDmatch matches and are kindly provided by Brit Nicholson (methodology described here). Based on the shared cM data between shared matches, we create different trees based on the putative relationships. We then use the probabilities to test every scenario which are then ranked.
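Ranking scenarios by probability can be sketched as follows. The description says the probabilities are used to test and rank every scenario but does not give the exact formula, so this Python sketch assumes the per-pair probabilities are simply multiplied together (done here as a sum of logarithms). The tree names and most of the numbers are invented; only the 64.9% figure for Bill and David appears later in this article.

```python
import math


def score_tree(pair_probabilities):
    """Combine per-pair relationship probabilities into a single log score."""
    return sum(math.log(p) for p in pair_probabilities.values())


def rank_trees(candidate_trees):
    """Return tree names ordered from most to least likely."""
    return sorted(
        candidate_trees,
        key=lambda name: score_tree(candidate_trees[name]),
        reverse=True,
    )


# Each candidate tree assigns a relationship to every pair; the value is the
# probability of that relationship given the shared cM (hypothetical numbers).
candidates = {
    "tree_a": {("me", "Bill"): 0.60, ("me", "David"): 0.50, ("Bill", "David"): 0.649},
    "tree_b": {("me", "Bill"): 0.60, ("me", "David"): 0.20, ("Bill", "David"): 0.30},
}
ranking = rank_trees(candidates)
print(ranking)
```

Under this assumption, two trees that assign equally probable relationships to every pair end up with identical scores, which is one way several trees can tie for most likely.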

AutoKinship Tree

Predicted trees from the AutoTree analysis are based on genealogical trees shared by the DNA matches and, if available, shared by the tested person. The relationships between DNA matches based on their common ancestors as provided by AutoTree are used to perform an AutoKinship analysis and are overlaid on the predicted AutoKinship tree.

AutoKinship Tree is New

AutoKinship Tree is the new feature that combines the features of both AutoTree and AutoKinship. You receive:

  • Common ancestors between you and your matches
  • Trees of people who don’t share your common ancestors but share ancestors with each other
  • Combined with relationship predictions and
  • A segment analysis

Of course, the relative success of the tree tools depends upon how many people have uploaded GEDCOM files.

Big hint, if you haven’t uploaded your family tree, do so now. If you are an adoptee or searching for a parent and don’t know who your ancestors are, AutoKinship Tree does its best without your tree information, and you will still benefit from the trees of others combined with predicted relationships based on DNA.

It’s easier to show you than to tell you, so let’s step through my results one section at a time.

I’m going to be using cluster 5 which has 32 members and cluster 136 which has 8 members. Ironically, cluster 136 is a much more useful cluster, with 8 good matches, than cluster 5 which includes 32 people.

Results of the AutoKinship Analyses

As you scroll down your results, you’ll see a grid beneath the Explanation area.

It’s easy to see which cluster received results for each tool. My cluster 5 has results in each category, along with surnames. (Notice that you can search for surnames which displays only the clusters that contain that surname.)

I can click on each icon to see what’s there waiting for me.

Additionally, you can click at the top on the blue middle “here” for an overview of all common ancestors. Who can resist that, right?

Click on the ancestor’s name or the tree link to view more information.

You can also view common locations by clicking on the blue “here” at far right. A location, all by itself, is a HUGE hint.

Clicking on the tree link shows you the tree of the tester with ancestors at that location. I had several others from North Carolina, generally, and other locations specifically. Let’s take a look at a few examples.

Common Ancestor Clusters

Click on the first blue link to view all common ancestors.

Common Ancestor Clusters summarize all of the clusters by ancestor. In other words, if any of your matches have ancestors in common in their tree, they are listed here.

These clusters include NOT just the people who share ancestors in a tree with you, but also people who share known ancestors with each other BUT NOT YOU. That may be incredibly important when you are trying to identify your ancestors – as in brick walls. Your ancestors may be their ancestors too, or your common segments might lead to your common ancestors if you complete their tree.

There are other important hints too.

In my case, above, Jacob Lentz is my known ancestor.

However, Sarah Barron is not my ancestor, nor is John Vincent Dodson. They are the descendants of my Dodson ancestor though. I recognized that surname and those people. In other instances, recognizing a common geography may be your clue for figuring out how you connect.

In the cluster column at left, you can see the cluster number in which these people are found.

Common Locations Table

Clicking on the second link provides a Common Location Table.

Some locations are general, like a state, and others are town, county or even village names. Whatever people have included in their GEDCOM files that can be connected.

Looking at this first entry, I recognize some of the ancestral surnames of Karen’s ancestors. The fact that we are found in the same cluster and share DNA indicates a common ancestor someplace.

Check for this same person in additional locations, then, look at their tree.

Ok, back to the AutoKinship Analysis Table and Cluster 136.

Cluster 136

I’m going to use Cluster 136 as an example because this cluster has generated great reports using all of the tools, indicated by the icon under each column heading. Some clusters won’t have enough information for everything so the tools generate as much as possible.

Scrolling down to Cluster 136 in the AutoCluster Information report, just beneath the list of clusters, I can see my 8 matches in that cluster.

Of course, I can click on the links for specific information, or contact them via email. At the end of this article in the “Tell Me Everything” section, I’ll provide a way to retrieve as much information as possible about any one match. For now, let’s move to the AutoTree.

Cluster 136 AutoTree

Clicking on the icon under AutoTree shows me how two of the matches in this cluster are related to each other and myself.

Note that the centimorgan badges listed refer to the number of cM that I share with each of these people, not how much they share with each other.

Click on any of the people to see additional information.

When I click on J Lentz m F Moselman, a popup box shows me how this couple is related to me and my matches.

Of course, you can also view the Y DNA or mitochondrial DNA haplogroups if the testers have provided that information when they set up their GEDmatch profile information.

Just click on the little icons.

If the testers have not provided that information, you can always check at FamilyTreeDNA or 23andMe, if they have tested at either of those vendors, to view their haplogroup information.

Today, GEDmatch kit numbers are assigned randomly, but in the early days, before Genesis, the leading letter of A meant AncestryDNA, F or T for FamilyTreeDNA, M for 23andMe and H for MyHeritage. If the kit number is something else, perform a one-to-one or a one-to-many report which will display the source of their DNA file.
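The old prefix convention is easy to express as a lookup table. In this Python sketch the function name and the sample kit numbers are hypothetical, and the result is only a hint: because newer kit numbers are assigned randomly, a leading letter on a recent kit does not reliably indicate the vendor.

```python
# Leading letters used for early (pre-Genesis) GEDmatch kits, as listed above.
LEGACY_PREFIXES = {
    "A": "AncestryDNA",
    "F": "FamilyTreeDNA",
    "T": "FamilyTreeDNA",
    "M": "23andMe",
    "H": "MyHeritage",
}


def legacy_vendor_hint(kit_number):
    """Return the vendor suggested by an old-style prefix, or None if unknown."""
    if not kit_number:
        return None
    return LEGACY_PREFIXES.get(kit_number[0].upper())


# Hypothetical kit numbers.
print(legacy_vendor_hint("A123456"))
print(legacy_vendor_hint("ZZ98765"))
```

When the function returns None, or when the kit is recent, the one-to-one or one-to-many report is the way to confirm the source.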

The small number, 136 in this case, beside the cM number indicates the cluster or clusters that these people are members of. Some people are members of multiple clusters.

Let’s see what’s next.

Cluster 136 Common Ancestors

Clicking on the Ancestors icon provides a report that shows all of the Ancestor Clusters in cluster 136.

The difference between this ancestor chart and the larger chart is that this only shows ancestors for cluster 136, while the larger chart shows ancestors for the entire AutoCluster report.

Cluster 136 Locations

All of the locations shown are included in trees of people who cluster together in cluster 136. Of course, this does NOT mean that these locations are all relevant to cluster 136. However, finding my own tree listed might provide an important clue.

Using the location tool, I discover 5 separate location clusters. This location cluster includes me with each tester’s ancestors who are found in Montgomery County, Ohio.

The difference between this chart for cluster 136 only and the larger location chart is that every location in this chart is relevant for people who all cluster together meaning we all share some ancestral line.

Viewing the trees of other people in the cluster may suggest ancestors or locations that are essential for breaking down brick walls.

Cluster 136 AutoKinship

Clicking on the anchor in the AutoKinship column provides a genetically reconstructed tree based on how closely each of the people match me, and each other. Clearly, in order to be able to provide this prediction, information about how your matches also match each other, or don’t, is required.

Again, the cM amount shown is the cM match with me, not with each other. However, if you click on a match, a popup will be shown that shows the shared cM between that person and the other matches as well as the relationship prediction between them in this tree.

So, Bill matches David with a total of 354.3 cM and they are positioned as first cousins once removed in this tree. The probability of the match being a 1C1R (first cousin once removed) is 64.9%, meaning of course that other relationships are possible.

Note that Bill and David ALSO share a segment with me in autosegment cluster 185, on chromosome 3.

It’s important to note that while 136 is the autocluster number, meaning that colored block on the report, WITHIN clusters, autosegment clusters are formed and numbered. 

Each autosegment cluster receives its own number and the numbers are for the entire report. You will have more autosegment clusters than autoclusters, because at least some of the colorful autoclusters will contain more than one segment cluster.

Remember, autoclusters are those colorful boxes of matches that fly into place. Autosegment clusters are the matching triangulated clusters on chromosomes and they are represented by the blue bars, shown below.

AutoCluster 136 contains 5 different autosegment clusters, but Bill is only included in one of those autosegment clusters.

You’ll notice that there are some people, like Robin at the bottom, who do match some other people in the cluster, but either not enough people, or not enough overlapping DNA to be included as an autocluster member.

The small colored chromosomes with numbers, boxed in red, indicate the chromosome on which this person matches me.

If you click on that chromosome icon, you’ll see a popup detailing everyone who matches me on that segment.

Note that in some cases a member of a segment cluster, like Robin, did not make it in the AutoCluster cluster. You can spot these occurrences by scrolling down and looking at the cluster column which will then be empty for that particular match.

Reconstructed AutoKinship Trees in Most Likely Order

Scrolling down the page, next we see that we have multiple possible trees to view. We are shown the most likely tree first.

Tree likelihood is constructed based on the combined probability of my matching cM to an individual plus their likely relationship to each other based on the amount of DNA they share with each other as well.

In my case, all of the first 8 trees are equally likely to be accurate, based on autosomal genetic relationships only. The ninth tree is only very slightly less likely to be accurate.

The X chromosome is not utilized separately in this analysis, nor are Y or mitochondrial DNA haplogroups if provided.

DNA Relationship Matrix

Continuing to scroll down, we next see the DNA matrix that shows relationships for cluster 5 in a grid format. Click on “Download Relationship Matrix” to view in a spreadsheet.

Keep scrolling for the next view, which is the Individual Segment Cluster Information.

Individual Segment Cluster Information

Remember that we are still focused on only one cluster – in this case, cluster 136. Each cluster contains people who all match at least some subset of other people in the cluster. Some people will match each other and the tested person on the same chromosome segment, and some won’t. What we generally see within clusters are “subclusters” of people who match each other on different chromosomes and segments. Also, some matches from cluster 136 might match other people but those matches might not be a member of cluster 136.

In autocluster 136, I have 14 DNA segments that converge into 5 segment clusters with my matches. Here’s segment cluster 185 that consists of two people in addition to me. Note that for individuals to be included in these segment clusters at GEDmatch, they must triangulate with people in the same segment cluster.

From left to right, we see the following information:

  • AutoCluster number 136, shown below

  • Segment cluster 185. This is a segment cluster within autocluster 136.

  • Segment cluster 185 occurs on chromosome 3, between the designated start and stop locations.
  • The segment representation shows the overlapping portions of the two matches, to me. You can easily see that they overlap almost exactly with each other as well.
  • The SNP count is shown, followed by the name and cM count.
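The triangulation requirement for segment clusters, where two matches must overlap on the same segment with me and also match each other there, can be sketched in Python like this. The tuple layout, function names and positions are invented for illustration, and the sketch ignores details a real tool has to handle, such as minimum segment size and which parental copy of the chromosome is involved.

```python
def overlap(a, b):
    """Return the shared (start, end) of two position ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def triangulates(tester_a, tester_b, a_b):
    """Three-way check on (chromosome, start, end) tuples.

    tester_a: segment the tester shares with match A
    tester_b: segment the tester shares with match B
    a_b:      segment A and B share with each other
    """
    if not (tester_a[0] == tester_b[0] == a_b[0]):
        return False
    # Region where both matches overlap with the tester.
    common = overlap(tester_a[1:], tester_b[1:])
    # A and B must also match each other somewhere inside that region.
    return common is not None and overlap(common, a_b[1:]) is not None


# Invented positions on chromosome 3.
me_bill = (3, 100, 200)
me_david = (3, 120, 210)
bill_david = (3, 110, 190)
print(triangulates(me_bill, me_david, bill_david))
```

If the third comparison is missing, the two matches may simply be related to me through different parents on the same stretch of the chromosome.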

Cluster 136 AutoKinship Tree

The AutoKinship Tree column is different from the AutoKinship column in one fundamental way. The new AutoKinship Tree feature combines the genealogical AutoTree and the genetic AutoKinship output together in one report.

You can see that the “prior” genealogical tree information that one of my matches also descends from Jacob Lentz (and wife, if you click further) has now been included. The matches without trees have been reconstructed around the known genealogy based on how they match me and each other.

I was already aware of how I’m related to Bill, David, *C and *R, but I don’t know how I am related to these other people. Based on their kit identifier, I can go to the vendor where they tested and utilize tools there, and I can check to see if they have uploaded their DNA files elsewhere to discover additional records information or critical matches. Now at least I know where in the tree to search.

Cluster 136 AutoSegment

Clicking on AutoSegment provides you with segment information. Each cluster is painted on your chromosomes.

By hovering over the darkly colored segments, which are segment clusters, you can view who you match, although to view multiple matches, continue scrolling.

In the next section, you’ll see the two segment clusters contained wholly within cluster 136.

Following that is the same information for segment clusters partially linked to cluster 136, but not contained wholly within 136.

Bonus – Tell Me Everything – Individual Match Clusters

We’ve focused specifically on the AutoKinship tools, but if you’re interested in “everything” about one specific match, you can approach things from that perspective too. I often look at a cluster, then focus on individuals, beginning with those I can identify which focuses my search.

If you click on any person in your match list, you’ll receive a report focusing on that person in your autocluster.

Let’s use cousin Bill as an example. I know how he’s related to me.

You can choose to display your chosen cluster by:

  • Cluster
  • Number of shared matches
  • Shared cM with the tester
  • Name

I would suggest experimenting with all of the options and see which one displays information that is most useful to the question you’re trying to answer.

Beneath the cluster for Bill, you’ll see the relevant information about the cluster itself. Bill has cluster matches on two different chromosomes.

The AutoCluster Cluster member Information report shows you how much DNA each cluster member shares with the tested person, which is me, and with each other cluster member. It’s easy to see at a glance who Bill is most closely related to by the number of cMs shared.

Only one of Bill’s chromosomes, #3, is included in clusters, but this tells me immediately that this/these segments on chromosome 3 triangulate between me, Bill, and at least one other person.

Segments shown in orange (chromosome 22) match me, but are not included in a cluster.

Special Use Cases – Unknown People

For adoptees and people trying to figure out how they are related to closer relatives, especially those without a tree, this new combined AutoKinship tool is wonderful.

400 cM is the upper default limit when running the report, meaning that close family members will not be included because they would be included in many clusters. However, you can make a different selection. If you’re trying to determine how several closely related people intersect, select a high threshold to include everyone.

Select a lower number of matches, like 25 or 50.

In this example, “no limit” was selected as the upper total match threshold and 25 closest matches.

AutoKinship then constructs a genetic tree and tells you which trees are possible and most likely. If some people do have trees, that common ancestor information would be included as well.

Note that when matches occur over the 400 cM threshold, there will be too many common chromosome matches so the chromosome numbers are omitted. Just check the other reports.

This tool would have helped a great deal with a recent close match who didn’t know how they are related to my family.

You can see this methodology in action and judge its accuracy by reconstructing your own family, assuming some of your known family members have uploaded to GEDmatch. Try it out.

It’s a Lot!

I know there’s a lot here to absorb, but take your time and refer back to this article as needed.

This flexible new tool combines DNA matching, genealogy trees, genetic trees, locations, autoclusters, a chromosome browser, and triangulation. It took me a few passes and working with different clusters to understand and absorb the information that is being provided.

For people who don’t know who their parents or close relatives are, these tools are amazing. Not only can they determine who they are related to, and who is related to each other, but with the use of trees, they can view common ancestors which provides possible ancestors for them too.

For people painting their triangulated segments at DNAPainter, AutoKinship provides triangulation groups that can be automatically painted using the Cluster Auto Painter, here, plus helps to identify that common ancestor. You can read more about DNAPainter, here.

For people seeking to break down brick walls, AutoKinship Tree provides assistance by providing tree matching between your matches for common ancestors NOT IN YOUR TREE, but that ARE in theirs. Your brick walls are clearly not (yet) identified in your tree, although that’s our fervent hope, right?

Even if your matches’ trees don’t go far enough back, as a genealogist, you can extend those trees further to hopefully reveal a previously unknown common ancestor.

The Best Things You Can Do

Aside from DNA testing, the three best things you can do to help yourself and your clusters are:

  • Upload your GEDCOM file, complete with locations, so you have readily available trees. Ask your matches to do so as well. Trees help you and others too.
  • Encourage people you match at Ancestry, which provides no chromosome segment information or chromosome browser, to upload a copy of their DNA files and tree.
  • Test your family members and cousins, and encourage them to upload their DNA and their trees. Offer to assist them. You can find step-by-step download/upload instructions here.

Have fun!

Identify Your Ancestors – Follow Nested Ancestral Segments

I don’t think that we actively think about our DNA segments as nested ancestors, like Russian Matryoshka dolls, but they are.

That’s exactly why segment information is critical for genealogists. Every segment, and every portion of a segment, has an incredibly important history. In fact, you could say that the further back in time we can track a segment, the more important it becomes.

Let’s see how to unveil nested segments. I’ll use my chromosome 20 as an example because it’s a smaller chromosome. But first, let’s start with my pedigree chart.

Pedigree

Before we talk about nested segments that originated with specific ancestors, it’s important to take a look at the closest portion of my maternal pedigree chart. My DNA segments came from and through these people. I’ll be working with the first 5 generations, beginning with my mother as generation #1.

Generation 1 – Parents

In the first generation, we receive a copy of each chromosome from each parent. I have a copy of chromosome 20 from my mother and a copy from my father.

At FamilyTreeDNA, you can see that I match my mother on the entire tested region of each chromosome.

Therefore, the entire length of each of my chromosomes is assigned to both mother and father because I received a copy from each parent. I’m fortunate that my mother’s DNA was able to be tested before she passed away.

We see that each copy of chromosome 20 is a total of 110.20 cM long with 17,695 SNPs.

Of course, my mother inherited the DNA on her chromosome 20 from multiple ancestors whose DNA combined in her parents, a portion of which was inherited by my mother. Mom received one chromosome from each of her parents.

I inherited only one copy of each chromosome (in this case, chromosome 20) from Mom, so the DNA of her two parents was divided and recombined so that I inherited a portion of my maternal chromosome 20 from both of my maternal grandparents.

Identifying Maternal and Paternal Matches

Associating matches with your maternal or paternal side is easy at FamilyTreeDNA because their Family Finder matching does it automatically for you if you upload (or create) a tree and link matches that you can identify to their proper place in your tree.

FamilyTreeDNA then uses that matching segment information from known, identified relatives in your tree to place people who match you both on at least one significant-sized segment in the correct maternal, paternal, (or both) buckets. That’s triangulation, and it happens automatically. All you have to do is click on the Maternal tab to view your triangulated maternal matches. As you can see, I have 1432 matches identified as maternal. 

Some other DNA testing companies and third-party tools provide segment information and various types of triangulation information, but they aren’t automated for your entire match list like Family Finder matching at FamilyTreeDNA.

You can read about triangulation in action at MyHeritage, here, 23andMe, here, GEDmatch, here, and DNAPainter, which we’ll use, here. Genetic Affairs AutoKinship tool incorporates triangulation, as does their AutoSegment Triangulation Cluster Tool at GEDmatch. I’ve compiled a reference resource for triangulation, here.

Every DNA testing vendor has people in their database who haven’t tested anyplace else. Your best strategy for finding nested segments and identifying matches to specific ancestors is to test at or transfer your DNA file to every vendor plus GEDmatch, where people who test at Ancestry sometimes upload for matching. Ancestry does not provide segment information or a chromosome browser, so you’ll sometimes find Ancestry testers have uploaded to GEDmatch, FamilyTreeDNA or MyHeritage where segment information is readily available. I’ve created step-by-step download/upload instructions for all vendors, here.

Generation 2 – Grandparents

In the second generation, meaning that of my grandparents, I inherited portions of my maternal and paternal grandmother’s and grandfather’s chromosomes.

My maternal and paternal chromosomes can be divided into two pieces or groups each, one for each grandparent.

Using DNAPainter, we can see my father’s chromosome 20 on top and my mother’s on the bottom. I have previously identified segments assigned to specific ancestors which are represented by different colors on these chromosomes. You can read more about how to use DNAPainter, here.

We can divide the DNA inherited from each parent into the DNA inherited from each grandparent based on the trees of people we match. If we test cousins from each side, assigning segments maternally or paternally becomes much, much easier. That’s exactly why I’ve tested several.

For the rest of this article, I’m focusing only on my mother’s side because the concepts and methods are the same regardless of whether you’re working on your maternal side or your paternal side.

Using DNAPainter, I expanded my mother’s chromosome 20 in order to see all of the people I’ve painted on my mother’s side.

DNAPainter allows us to paint matching segments from multiple testing vendors and assign them to specific ancestors as we identify common ancestors with our matches.

Based on these matches, I’ve divided these maternal matches into two categories:

  • Maternal grandmother, meaning my mother’s mother, bracketed in red boxes
  • Maternal grandfather, meaning my mother’s father, bracketed in black boxes.

The text and arrows in these graphics refer to the colors of the brackets/boxes, and NOT the colors of the segments beside people’s names. For example, if you look at the large black box at far right, you’ll see several people, with their matching segments identified by multiple colored bars. The different colored segments (bars) mean I’ve associated those matches with different ancestors at different generational levels.

Generation 3 – Great-grandparents

Within those maternal and paternal grandparent segments, more nested information is available.

The black Ferverda grandfather segments are further divided into black, from Hiram Ferverda, and gold from his wife Eva Miller. The same concept applies to the red grandmother segments which are now divided into red representing Nora Kirsch and purple representing Curtis Lore, her husband.

While I have only been able to assign the first four segments (at the top) to one person/ancestor, there’s an entire group of matches who share the grouping of segments at right, in gold, descended through Eva Miller. The Miller line is Brethren and Mennonite with lots of testers, so this is a common pattern in my DNA matches.

Eva Miller, the gold ancestor, has two parents, Margaret Elizabeth Lentz and John David Miller, so her segments would come from those two sides.

Generations 4 and 5 – Fuchsia Segment

I was able to track the segment shown in fuchsia, indicated by the blue arrow, to Jacob Lentz and his wife Fredericka Ruhle, German immigrant ancestors. Other people in this same match (triangulation) group descend from Margaret Elizabeth Lentz and John David Miller – but that fuchsia match is the one that shows us where that segment originated. This allows us to assign that entire gold/blue bracketed set of segments to a specific ancestor or ancestral couple because they triangulate, meaning they all match me and each other.

Therefore, all of the segments that match with the fuchsia segment also track back to Jacob Lentz and Fredericka Ruhle, or to their ancestors. We would need people who descend from Jacob’s parents and/or Fredericka’s parents to determine the origins of that segment.

In other words, we know all of these people share a common source of that segment, even if we don’t yet know exactly who that common ancestor was or when they lived. That’s what the process of tracking back discovers.

To be very clear, I received that segment through Jacob and Fredericka, but some of those matches who I have not been able to associate with either Jacob or Fredericka may descend from either Jacob or Fredericka’s ancestors, not Jacob and Fredericka themselves. Connecting the dots between Jacob/Fredericka and their ancestors may be enlightening as to the even older source of that segment.
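The triangulation test described above (everyone in the group matches me and each other on the same stretch of DNA) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SharedSegment` record, the position units, the `min_length` threshold, and the cousin names are all hypothetical, and no vendor's actual matching algorithm is implied.

```python
from dataclasses import dataclass
from itertools import combinations, product


@dataclass(frozen=True)
class SharedSegment:
    person_a: str
    person_b: str
    chromosome: int
    start: float  # illustrative position units, not real coordinates
    end: float


def common_overlap(segments):
    """Return the (start, end) stretch covered by every segment, or None."""
    if len({seg.chromosome for seg in segments}) != 1:
        return None
    start = max(seg.start for seg in segments)
    end = min(seg.end for seg in segments)
    return (start, end) if end > start else None


def triangulates(people, shared_segments, min_length=7.0):
    """True if every pair of people matches and all of those matches
    overlap on one common stretch at least min_length long.
    min_length is a hypothetical threshold chosen for illustration."""
    by_pair = {}
    for seg in shared_segments:
        key = frozenset((seg.person_a, seg.person_b))
        by_pair.setdefault(key, []).append(seg)

    pair_lists = []
    for a, b in combinations(people, 2):
        segs = by_pair.get(frozenset((a, b)))
        if not segs:
            return False  # this pair does not match each other at all
        pair_lists.append(segs)

    # Try one segment per pair and look for a stretch common to all of them.
    for choice in product(*pair_lists):
        region = common_overlap(choice)
        if region is not None and region[1] - region[0] >= min_length:
            return True
    return False


# Made-up example data for chromosome 20.
segments = [
    SharedSegment("me", "cousin_a", 20, 10.0, 32.0),
    SharedSegment("me", "cousin_b", 20, 12.0, 30.0),
    SharedSegment("cousin_a", "cousin_b", 20, 11.0, 31.0),
]
print(triangulates(["me", "cousin_a", "cousin_b"], segments))
```

Dropping the cousin-to-cousin segment from the list makes the function return `False`, which is the point of triangulation: two people who each match me on the same stretch but do not match each other may be related through different sides of my family.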

Let’s take a look at nested segments on my pedigree chart.

Nested Pedigree

You can see the progression of nesting on my pedigree chart, using the same colors for the brackets/boxes. The black Ferverda box at the grandparent level encompasses the entire paternal side of my mother’s ancestry, and the red includes her mother’s entire side. This is identical to the DNAPainter graphic, just expressed on my pedigree chart instead of my chromosome 20.

Then the black gets broken into smaller nested segments of black, gold and fuchsia, while the red gets broken into red and purple.

If I had more matches that could be assigned to ancestors, I would have even more nested levels. Of course, if I were using all of my chromosomes, not just 20, I would be able to go back further as well.

You can see that as we move further back in time, the bracketed areas assigned to each color become smaller and smaller, as do the actual segments as viewed on my DNAPainter chromosomes.
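One way to picture nesting is as containment: a segment assigned to a more distant ancestor sits inside a larger segment assigned to a closer one. The sketch below is only an illustration. The ancestor names and generation numbers come from the discussion above, but the start and end positions are invented placeholders, and `PaintedSegment` and `nesting_at` are hypothetical names, not part of DNAPainter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaintedSegment:
    ancestor: str
    generation: int  # 2 = grandparent, 3 = great-grandparent, and so on
    start: float     # invented placeholder positions, not real data
    end: float


def nesting_at(position, painted):
    """List the ancestors whose painted segments cover a position,
    ordered from the closest generation to the most distant."""
    covering = [p for p in painted if p.start <= position <= p.end]
    return [p.ancestor for p in sorted(covering, key=lambda p: p.generation)]


painted = [
    PaintedSegment("Maternal grandfather (Ferverda)", 2, 0.0, 40.0),
    PaintedSegment("Eva Miller", 3, 22.0, 40.0),
    PaintedSegment("Jacob Lentz and Fredericka Ruhle", 5, 25.0, 38.0),
]

print(nesting_at(30.0, painted))
print(nesting_at(10.0, painted))
```

Each list reads like a trail of breadcrumbs: the further back the last name in the list, the deeper that stretch of the chromosome has been traced.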

Segments Get Progressively Smaller

You can see in the pedigree chart and segment painting above that the segments we inherit from specific ancestors divide over time. As we move further and further back in our tree, the segments inherited from any specific ancestor get smaller and smaller too.

Dr. Paul Maier in the MyOrigins 3.0 White Paper provides this informative graphic that shows the reduction in segments and the number of ancestors whose DNA we carry reaching back in time.

I refer to this as a porcupine chart.

Eventually, we inherit no segments from red ancestors, and the pieces of DNA that we inherit from the distant blue ancestors become so small and fragmented that they cannot be positively identified as coming from a specific ancestor when compared to and matched with other people. That’s why vendors don’t show small segment matches, although different vendors utilize different segment thresholds.

The debate about how small is too small continues, but the answer is not simply segment size alone. There is no one-size-fits-all answer.

As segments become smaller, the probability that we match another person by chance (IBC) increases. Proof that someone shares a specific ancestor, especially when dealing with increasingly smaller segments, is a function of multiple factors, such as tree completeness for both people, shared matches, parental match confirmation, and more. I wrote about What Constitutes Proof, here.

In the Family Finder Matching White Paper, Dr. Maier provides this chart reflecting IBD (Identical By Descent) and IBC (Identical By Chance) segments and the associated false-positive rate. That rate is how likely you are to match someone on a segment of that size by chance and NOT because you both share the DNA from a common ancestor.

I wrote Concepts: Identical by Descent, State, Population and Chance to help you better understand how this works.

In the chart below, I’ve combined the generations, relationships, # of ancestors, assuming no duplicates, birth year range based on an approximate 30-year generation, percent of DNA assuming exactly half of each ancestor’s DNA descends in each generation (which we know isn’t exactly accurate), and the average amount of total inherited cMs using that same assumption.

Note that beginning with the 7th generation, on average, we can expect to inherit less than 1% of the DNA of an ancestor, or approximately 55 total cM which may be inherited in multiple segments.
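The averages described above follow from simple halving, which can be sketched as below. This assumes, as the text does, that exactly half of each ancestor's DNA descends in each generation. The total of roughly 7,000 cM is an assumption chosen as a round figure that reproduces the approximately 55 cM at generation 7; vendors report somewhat different totals. The function name `generation_row` is hypothetical.

```python
TOTAL_CM = 7000.0  # assumed round total; actual vendor totals differ


def generation_row(generation, total_cm=TOTAL_CM):
    """Average inheritance from one ancestor at a given generation,
    counting parents as generation 1 and assuming exact halving."""
    fraction = 0.5 ** generation
    return {
        "generation": generation,
        "ancestors": 2 ** generation,  # assumes no duplicate ancestors
        "percent": fraction * 100,
        "avg_cm": total_cm * fraction,
    }


print(f"{'Gen':>3} {'Ancestors':>9} {'Percent':>9} {'Avg cM':>9}")
for gen in range(1, 12):
    row = generation_row(gen)
    print(f"{row['generation']:>3} {row['ancestors']:>9} "
          f"{row['percent']:>9.4f} {row['avg_cm']:>9.2f}")
```

Under these assumptions, generation 7 is the first row where the percentage from a single ancestor drops below 1%.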

The amount of actual cMs inherited in each generation can vary widely and explains why, beginning with third cousins, some people won’t share DNA from a common ancestor above the various vendor matching thresholds. Yet, other cousins several generations removed will match. Inheritance is random.

Parallel Inheritance

In order to match someone else descended from that 11th generation ancestor, BOTH you AND your match will need to have inherited the exact SAME DNA segment, across 11 generations EACH in order to match. This means that 11 transmission events for each person will need to have taken place in parallel with that identical segment being passed from parent to child in each line. For 22 rolls of the genetic dice in a row, the same segment gets selected to be passed on.
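The "22 rolls of the dice" can be put into rough numbers. The sketch below is a deliberately simplified model, not a real inheritance simulation: it assumes each parent-to-child transmission is an independent coin flip in which one specific segment is passed on intact with probability one half, and it ignores recombination splitting the segment, the many other segments an ancestor carried, and the number of descendants who might test. The function name and the `pass_probability` parameter are hypothetical.

```python
def parallel_inheritance_probability(generations, pass_probability=0.5):
    """Chance that one specific segment survives intact down two
    separate lines of descent, each `generations` transmissions long."""
    transmissions = 2 * generations  # one set of transmissions per line
    return transmissions, pass_probability ** transmissions


transmissions, probability = parallel_inheritance_probability(11)
print(f"{transmissions} transmissions, about 1 in {round(1 / probability):,}")
```

Because an ancestor carried many segments and may have many tested descendants, the odds of matching somebody on something are far better than this single-segment figure suggests, but it shows why any one distant match deserves scrutiny.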

You can see why we all need to work to prove that distant matches are valid.

The further back in time we work, the more factors we must take into consideration, and the more confirming proof is needed that a match with another individual is a result of a shared ancestor.

Having said that, shared distant matches ARE the key to breaking through brick-wall ancestors. We just need to be sure we are chasing the real deal and not a red herring.

Exciting Possibilities

The most exciting possibility is that some segments are actually passed intact for several generations, meaning those segments don’t divide into segments too small for matching.

For example, the 22 cM fuchsia segment that tracks through generations 4 and 5 to Jacob Lentz and Fredericka Ruhle has been passed either intact or nearly intact to all of those people who stack up and match each other and me on that segment. 22 cM is definitely NOT a small segment and we know that it descended from either Jacob or Fredericka, or perhaps combined segments from each. In any case, if someone from the Lentz line in Germany tested and matched me on that segment (and by inference, the rest of these people too), we would know that segment descended to me from Jacob Lentz – or at least the part we match on if we don’t match on the entire segment.

This is exactly what nested segments are…breadcrumbs to ancestors.

Part of that 22 cM segment could be descended from Jacob and part from Fredericka. Then of Jacob’s portion, for example, pieces could descend from both his mother and father.

This is why we track individual segments back in time to discern their origin.

The Promise of the Future

The promise of the future is when a group of other people triangulate on a reasonably sized segment AND know where it came from. When we match that triangulation group, their identified segment may well help break down our brick walls because we match all of them on that same segment.

It is exactly this technique that has helped me identify a Womack segment on my paternal line. I still haven’t identified our common ancestor, but I have confirmed that the Womacks and my Moore/Rice family interacted as neighbors 8 generations ago and likely settled together in Amelia County, migrating from eastern Virginia. In time, perhaps I’ll be able to identify the common Womack ancestor and the link into either my Moore or Rice lines.

I’m hoping for a similar breakthrough on my mother’s side for Philip Jacob Miller’s wife, Magdalena, 7 generations back in my tree. We know Magdalena was Brethren and where they lived when they took up housekeeping. We don’t know who her parents were. However, there are thousands of Miller descendants, so it’s possible that eventually, we will be able to break down that brick wall by using nested segments – ours and those of people who descend from Magdalena’s siblings, aunts, and uncles.

Whoever those people were, at least some of their descendants will likely match me and/or my cousins on at least one nested Miller segment that will be the same segment identified to their ancestors.

Genealogy is a team sport and solving puzzles using nested segments requires that someone out there is working on identifying triangulated segments that track to their common ancestors – which will be my ancestors too. I have my fingers crossed that someone is working on that triangulation group and I find them or they find me. Of course, I’m working to triangulate and identify my segments to specific ancestors – hoping for a meeting in the middle – that much-desired bridge to the past.

By the time you’ve run out of other records, nested segments are your last chance to identify those elusive ancestors. 

Do you have genealogical brick walls that nested segments could solve?

__________________________________________________________

Follow DNAexplain on Facebook, here or follow me on Twitter, here. You can also subscribe to receive emails when I publish articles by clicking the “Follow” button at www.DNAexplain.com.

You’re always welcome to forward articles or links to friends.

You Can Help Out and It’s Free

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Uploads

Genealogy Products and Services

My Book

Genealogy Books

Genealogy Research

23andMe Genetic Tree Provides Critical Clue to Solve 137-Year-Old Disappearance Mystery

DNA can convey messages from the great beyond – from times past and people that died long before we were born.

I had the most surprising experience this week. It began with receiving an email with the sender name of my long-time research buddy, cousin Garmon Estes.

It’s all the more surprising because not only did Garmon never own a computer, despite my ceaseless encouragement, he passed over in 2013 at the age of 85. So, imagine my shock to open my email to see a message from Garmon. Cue up spooky music😊

As it turned out, Garmon’s nephew is also Garmon. I had communicated with the family off and on over the years since the death of Garmon the elder. Garmon, the younger, had written to tell me that the second “great brick wall” that haunted his Uncle Garmon had fallen – and how that happened, thanks to DNA.

Garmon, the Elder

Garmon Estes, the elder

I first met Garmon the elder, via letter, back in the 1970s or maybe early 80s. He was an experienced genealogist and I was a beginner.

At that time, Garmon had been chasing the identity of the father of our common ancestor, John R. Estes, for decades, and I was just embarking on what would become a lifelong adventure, or perhaps it could better be called an obsession.

John R. Estes had moved from some unknown location to Claiborne County, Tennessee with his wife and family about 1820. That’s pretty much all we knew at that time. Garmon had spent decades before the age of online records researching every John Estes he could find. I can’t even begin to tell you how many John Esteses existed that needed to be eliminated as candidates.

Garmon lived in California, far from Tennessee. I lived in Indiana, then Michigan – significantly closer. He began caring for his ill spouse, and I began traveling to dusty courthouses, sometimes reading musty books page by yellowed page, extracting everything Estes. Garmon worked from his local Family History Center when he could and wrote letters.

Between our joint sleuthing and many theories that we both composed and subsequently shot down, we narrowed John R. Estes’s location of origin to Halifax County, Virginia. However, there were multiple John Esteses living there at the same time, about the same age, none using middle initials reliably, and some not at all. How inconsiderate!

I began perusing every possible record. I had eliminated some Johns as candidates, most often because they clearly remained in the community after our John had moved to Claiborne County. Late one night, in our local Family History Center, I found that fateful clue – John R. Estes noted as (S.G.), short for “son of George,” on just one tax list. All it takes is that one gold-nugget record.

It was after 10 PM when I left the Family History Center and even later when I got home. I debated whether I should call Garmon or not, but I decided that indeed, he would want to know immediately, even if I did call at an inconvenient time or wake him up.

The discovery of John’s father, of course, opened the door for much more research, and it solved one of Garmon’s two brick walls that had haunted his genealogy life.

He never solved the second one, but it wasn’t for lack of trying.

What Happened to Willis Alexander Garmon Estes?

Willis Alexander Garmon Estes was born on December 21, 1854, in Lenoir, Roane County, TN. His nickname was Willie.

Willie married Martha Lee Mathis in 1874 and they had 4 children beginning with the first child born the next year in Roane County. Sometime between 1875 and the birth of the second child in 1877, they migrated to Greenwood, Wise County, Texas where their next two children were born in 1877 and 1881.

Martha was pregnant with their fourth child in 1883 when something very strange happened. Willie disappeared, and I do mean literally and completely. Just poof, gone.

With Martha unsure what to do, her father, who lived in Missouri, went to Texas to retrieve his pregnant daughter and her children and took them home to Missouri, where the last child was born that September.

Willie was only 28 when he vanished. The family, of course, had many stories about what happened. Texas at that time was pretty much the “wild west” and the stories about Willie reflected exactly that.

Texas was sometimes the refuge of outlaws and shady characters. One story revealed that Willie had shot a man back in Tennessee and the family fled to Louisiana, then Texas. Of course, that doesn’t tell us why he disappeared in Texas, but it opens the door to speculation and casts doubt on his character, perhaps.

Another story was that he was shot by Indians.

A third story stated that Willie settled in Indian Territory north of the Red River, now Oklahoma, and that he had an altercation with an Indian over the supposed theft of firewood, although who was accusing whom was unclear. Willie shot the Indian, then had to flee for his life, leaving his pregnant wife and children as a posse of Indian Police surrounded his house. Willie supposedly promised Martha that he would return, but never did. It was reported that he was shot in Mexico, but no further details emerged.

Aren’t these just maddeningly vague???

Yet another story was that Willie headed for the goldfields of California, struck it rich, and was murdered on the way back home. The details varied, but one version had him murdered by a traveling companion on the trail. Another had him becoming ill and dying in a hospital in St. Louis where his wife went to search for him, to no avail. That might explain why she went back to Missouri, Garmon postulated. And yet a third version was some hybrid of the two where “someone” tried to find Willie’s family for years to reveal what had happened, and where, but was never successful. Of course, how did the family know about this if the mystery person was unable to find the family? But I digress.

Garmon desperately wanted to solve that mystery. He wanted closure.

I didn’t realize that the genealogy bug had bitten Garmon’s nephew too, but it clearly has. Garmon would be so proud.

With Garmon the younger’s permission, I’m publishing “the rest of the story,” Connecting the Dots, as written by Garmon the younger, with a few technical interjections from me involving DNA from time to time.

Connecting the Dots

In 2015, my dad Richard Estes, my brother Corey Estes, and I took a trip to Texas and Oklahoma to see if we could find out more about Willis Alexander Garmon Estes’ disappearance.

We visited Greenwood, Texas and nearby Decatur where we looked at historical records at the Wise County Clerk Office. We also went up to Oklahoma City to see the state archives and to Tishomingo to look at any records that might be available.

Interestingly enough, we did not find any clues as to the disappearance of Willis Alexander Garmon Estes. There were no newspaper articles or criminal records concerning any incidents with Willis Alexander Garmon Estes. The only new information that we found was a couple of land deeds showing that Willis Alexander Garmon Estes’ brother Fielding had bought and sold land in Wise County during the time that Willis Alexander Garmon Estes was living in Greenwood.

We came away from our trip empty-handed, but our curiosity remained strong, and we began talking to each other about going on another trip, this time to Tennessee, to speak with Estes family members in Loudon County to see if they might know something about Willis Alexander Garmon’s disappearance.

DNA Testing

In December of 2018, my wife, children, and I had our DNA tested using the service 23andMe. We received test results within a month of sending in saliva samples. The results did not reveal anything unusual.

Fast forward to October 2019. 23andMe introduced a new Family Tree feature that automatically creates a family tree based on the DNA results that you share with relatives in 23andMe. This was a fascinating feature and I noticed that all of my family members were automatically placed into the correct position on the family tree without me having to do anything.

[Roberta’s note – this is not always the case, so don’t necessarily expect the same level of accuracy. The tree is a wonderful innovative feature, just treat family placement as hints and not facts.]

Every few weeks as more and more people had their DNA tested on 23andMe, new relatives were added to the family tree.

In February 2020, I noticed something interesting under the location of Willis Alexander Garmon Estes on the family tree. A woman by the name of Edna appeared as a descendant of Willis Alexander Garmon Estes. The first thing I did was to try and get in contact with her on 23andMe. No luck. Next, I thought maybe she was the descendant of one of Willis Alexander Garmon’s sons (James, John, or George). However, after researching the descendants of each of those lines, Edna’s name did not appear.

The next step I took was to look up as many Ednas by that last name on ancestry.com as I could find and trace their ancestry back to see where it led.

There were two Ednas by that last name in the United States whose age matched the one on 23andMe. I traced both of their ancestry lines back to the 1800s. Neither one had Willis Alexander Garmon Estes as an ancestor.

Breakthrough

During the middle of March 2020, when I was quarantined at home from work due to the COVID-19 virus, I took another look at Edna’s family lines. I noticed there was a gentleman by the name of James Henry Houston mentioned as an ancestor.

The interesting thing about James was that he was born on the same day, same year, and in the same county as Willis Alexander Garmon Estes. James Henry Houston was born on December 26, 1854 in Loudon County, Tennessee. This seemed like possibly more than a coincidence, so I dived into the data a little bit more.

I looked at federal census records to find out more about James Henry Houston’s past. Strangely, there were no official records of him until May 12, 1889, when he married Allie Ona Taylor in Erath, Texas. Normally, if someone is born in 1854, they would show up in one of the federal census records of 1860, 1870, or 1880. James Henry Houston does not show up in any official federal census records until 1900.

According to ancestry records, James Henry Houston married Allie Ona Taylor in 1889 and resided in the Hood County region of Texas until 1910. During this time, he raised 8 children with his wife Allie.

In 1920, the federal census placed him and Allie in Whitehall, Montana. The last federal census he appears in is 1930. He lived in Pomona, California where he died in 1933 at the age of 78.

At this point, I thought it was highly likely that James Henry Houston and Willis Alexander Garmon Estes were the same person. If my hunch was correct then a photo of James Henry Houston would most likely show a resemblance to his son, my great grandfather John Alexander Estes.

Estes James Henry Houston

The photos above show a remarkable similarity in the eyes, nose, mouth, and facial structure between the two men. To me, the photo and historical evidence are enough to conclude that Willis Alexander Garmon Estes is James Henry Houston.

Garmon’s Concluding Thoughts

As I reflect on the fact that Willis Alexander Garmon Estes renamed himself James Henry Houston and moved from Wise County down to Hood County, Texas – a distance of approximately 60 miles – to marry and raise a new family, many more questions come to mind.

What exactly happened to cause Willis Alexander Garmon Estes to leave his wife and children behind? Was it simply a marital dispute or did it involve a criminal offense and running from the law as was mentioned in the family lore?

Did my great grandfather know that his father lived in Pomona in 1930, which was only 6 miles away from where he was living in Rancho Cucamonga? Were there other family members that knew what happened but promised not to tell anyone else? We may never know.

Finally, I want to add one more piece to the story that I found fascinating. On ancestry.com, many of the family trees for James Henry Houston state that the mother and father of James Henry Houston were Jennie Bray and Henry Houston. No information is given for their birthdates or where they came from. The mother and father of Willis Alexander Garmon Estes were Jennie McVey and William Estes. The names Jennie Bray and Jennie McVey are very similar. In order to hide his true identity, James Henry Houston would have had to make up a surname for his father since he called himself Houston, not Estes. Willis Alexander Garmon Estes had a brother named John Houston Estes. This might explain why James Henry Houston chose to use the surname Houston rather than another name.

Congratulations Garmon

I know this made Garmon the elder puff up with pride for Garmon the younger’s sleuthing skills and leap for joy at the solve. Garmon, the elder, had two main genealogy goals throughout his entire life. One was solved while he was living, but it took another generation to solve this one.

Great job, Garmon!

About the 23andMe Genetic Tree

23andMe is the only vendor to construct a “trial balloon” genetic tree based only on how the tester matches people and how they do, or don’t, match each other. This occurs with no input from testers in the form of genealogical trees or identifying how people are related to the tester.

Family Tree DNA has Phased Family Matching, MyHeritage has Theories of Family Relativity, and Ancestry has ThruLines which all do some sort of DNA+tree+relationship connectivity, but since 23andMe does not support user-created or uploaded trees, anything they produce has to be using DNA alone.

On one hand, it’s frustrating for genealogists, but on the other hand, there is sometimes a benefit to a different “all genetic” approach.

Of course, unless your parents have tested, the only information that 23andMe has to utilize is how closely you match your matches and how closely your matches match each other. This allows 23andMe to place your matches in an approximately accurate “neighborhood” on your tree, unless your parents are related to each other, in which case that shared DNA causes things to get dicey quickly.

I wrote about 23andMe’s new relationship triangulation tree when it was first introduced in September 2019, nearly a year ago, here. The launch was rocky for a number of reasons, and if you’ve done genealogy for a long time, your research goals are likely to be further back in time than this 4 generation relationship tree will reveal.

23andMe tree

This is what my relationship tree looked like at the time the function was launched. You’ll note that 23andMe places relationships back in time 4 generations, to your great-great-grandparents, meaning that you might have 3rd or even 4th cousins showing up on your genetic tree.

I initially had a total of 18 people placed on my tree, with 3 being close family, 4 being accurate, 4 unknown, 1 uncertain and 6, or one third, inaccurate.

Keep in mind that 23andMe doesn’t make any provision to accommodate or take into account half-relationships, like half-brother or half-sister, either currently or historically. Therefore, descendant placement predictions can be “off” because half-siblings only carry the DNA from one common parent, instead of two, making those relationships appear more distant than they really are.

In Garmon’s case, his great-great-grandfather is the ancestor who was MIA, so the genetic tree has the potential to work well for this purpose.

Estes 23andme tree today

Today, my tree looks somewhat different, with only 14 people displayed instead of 18, and 6 waiting in the wings to see if I can help 23andMe figure out how and where to place them.

Since the initial launch, customers have been given the opportunity to add their ancestors’ names to their nodes. This works just fine so long as nobody married more than once and had children from both marriages.

Estes Willie Alexander today

Here’s a closer image of the left-hand side of my tree where I’ve super-imposed the location of Willis Alexander Garmon Estes and Edna, as they are related to Garmon the Younger, at bottom right. Ignore the other names – I only utilized my own tree for an example tree structure.

One more generation and it’s unlikely that 23andMe would have made the connection between Edna and Garmon the younger.

This illustrates not only the perfect reason to test the oldest generations in your family, but also why you should never ignore an unknown match that seems to be within the past 3 or 4 generations. You never know what mysteries you might unravel.

Four generations actually reaches back in time quite substantially. In my case, my great-great-grandparents were born in 1805, 1810, 1812, 1813, 1815, 1816, 1818 (2), 1820, 1822, 1827, 1829, 1830, 1832, 1841 and 1848.

If you have mysteries within your closest 4 generations to unravel, the genetic tree at 23andMe might provide valuable clues, but only if you’re willing to do the requisite work to figure out HOW these people match you.

You can’t transfer your DNA file TO 23andMe, so if you want to have your results in the 23andMe database, you’ll need to test there.

Acknowledgments: Thank you to Garmon Estes, the younger, for generously sharing this story and allowing publication. My heart was warmed to see your generational research trip.

Thank you to Garmon Estes, the elder, for being my research partner for so many years. You can finally RIP now, although somehow I suspect you already have these answers.
