Genetic Affairs has created a new version of AutoKinship at GEDmatch. The new AutoKinship report adds new features, allows for more kits to be included in the analysis, and integrates multiple reports together:
- AutoCluster – the autoclusters we all know and love
- AutoSegment – clusters based on segments
- AutoTree – reconstructed tree based on GEDCOM files of you and your matches, even if you don’t have a tree
- AutoKinship – the original AutoKinship report provided genetic trees. The new AutoKinship report includes AutoTree, combines both, and adds a feature called AutoKinship Tree. (Trust me on this one – you’ll see in a minute!)
- Common Ancestors with your ancestors
- Common Ancestors between matches, even if they don’t match your tree
- Common Locations
Maybe the best news is that some reports provide automatic triangulation because, at GEDmatch, it’s possible to not only see how you match multiple people, but also if those people match each other on that same segment. Of course, triangulation requires three-way matching in addition to the identification of common ancestors which is part of what AutoKinship provides, in multiple ways.
Let’s step through the included reports and features one at a time, using my clusters as an example.
Order Your Report
As a Tier 1 GEDmatch customer, sign in, select AutoKinship and order your report.
Note that there are now two clustering settings, the default setting and one that will provide more dense clusters. The latter is the default setting for AutoKinship, since it has been shown to produce better AutoKinship results.
You can also select the number of kits to consider. Since this tool is free with a GEDmatch Tier 1 subscription, you can start small and rerun if you wish, as often as you wish.
Currently, a maximum of 500 matches can be included, but that will be increased to 1000 in the future. Your top 500 matches will be included that fall within the cM matching parameters specified.
I’m leaving this at the maximum 400 cM threshold, so every match below that is included. I generally leave this default threshold because otherwise my closest matches will be in a huge number of clusters which may cause processing issues.
For a special use case where you will want to increase the cM threshold, see the Special Use Cases section near the end of this article.
You can select a low number of matches, like 25 or 50 which is particularly useful if you want to examine the closest matches of a kit without a tree.
Keep in mind that there is currently a maximum processing time of 10 minutes allowed per report. This means that if you have large clusters, which are the last ones processed, you may not have AutoKinship results for those clusters.
This also means that if you select a high cM threshold and include all 500 allowable matches, you will receive the report but the AutoKinship results may not be complete.
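If it helps to see the selection rule spelled out, here is a minimal sketch of "keep the matches inside the cM thresholds, then take the closest ones up to the kit limit." The function name, the sample matches, and the lower threshold value are my own assumptions for illustration; this is not GEDmatch's or Genetic Affairs' actual code.

```python
def select_matches(matches, min_cm, max_cm=400.0, max_kits=500):
    """Keep matches whose total shared cM lies within [min_cm, max_cm],
    then return the closest ones, at most max_kits of them."""
    in_range = [m for m in matches if min_cm <= m[1] <= max_cm]
    in_range.sort(key=lambda m: m[1], reverse=True)
    return in_range[:max_kits]


# Hypothetical (name, total shared cM) pairs, invented for this example
matches = [
    ("Match A", 310.0),
    ("Close relative", 1750.0),  # above the 400 cM threshold, so excluded
    ("Match B", 95.5),
    ("Match C", 22.0),
    ("Match D", 8.0),            # below the assumed lower threshold
]
selected = select_matches(matches, min_cm=10.0, max_kits=2)
print(selected)
```

Rerunning with a smaller `max_kits` (25 or 50) mirrors the "start small and rerun" approach described above.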
When finished, your report will be delivered to you as a download link with an attached zipped file which you will need to save someplace where you can find it.
If you’re a PC user, you’ll need to unzip or extract the files before you can use them. You’ll see the zipper on the file.
If you don’t extract the contents, you can click on the file to open which will display a list of the files, so it looks like the files are extracted, but they aren’t.
You can see that the file is still zipped.
You can click on the html file which will display the AutoCluster correctly too, but when you click on any other link within that file, you’ll receive this error message if the file is still zipped.
If this happens to you, it means the file is still zipped. Close the files you have open, right click on the yellow zipped file folder and “extract all.”
Then click on the HTML link again and everything should work.
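If you would rather extract the files with a script than with a right click, Python's standard `zipfile` module does the same job. This is just a sketch; the file and folder names in the usage comment are placeholders, not the names GEDmatch uses for your download.

```python
import zipfile
from pathlib import Path


def extract_report(zip_path, dest_dir):
    """Extract every file in the zipped report and return the HTML files found."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(dest.rglob("*.html"))


# Usage (placeholder names):
# html_files = extract_report("my_report.zip", "my_report")
# Open the returned HTML file from the extracted folder, not from inside the zip.
```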
Ok, on to the fun part – the tools.
I’ve written about most of these tools individually before, except for the new combinations of course. I’ve put all of the Genetic Affairs Tools, Instructions and Resources in one article that you can find here.
I recommend that you take a look to be sure you’re using each tool to its greatest advantage.
Click on the html file and watch your AutoCluster fly into place. I always, always love this part.
The first thing I noticed about my AutoCluster at GEDmatch is that it’s HUGE! I have a total of 144 clusters and that’s just amazing!
Information about the cluster file, including the number of matches, maximum and minimum cM used for the report, and minimum cluster size appears beneath your cluster chart.
22 people met the criteria but didn’t have other matches that did, so they are listed for my review, but not included in the cluster chart.
At first glance, the clusters look small, but don’t despair, they really aren’t.
My clusters only look small because the tool was VERY successful, and I have many matches in my clusters. The chart has to be scaled to be able to display on a computer monitor.
Genetic Affairs has introduced a new layout for the various included tools.
Each section opens to provide a brief description of the tool and what is occurring. This new tool includes four previous tools plus a new one, AutoKinship Tree, as follows:
AutoCluster first organizes your DNA matches into shared match clusters that likely represent branches of your family. Everyone in a cluster will likely be on the same ancestral line, although the MRCA between any of the matches and between you and any match may vary. The generational level of the clusters may vary as well. One may be your paternal grandmother’s branch, another may be your paternal grandfather’s father’s branch.
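To make the idea of a shared match cluster concrete, here is a deliberately simplified sketch. It is not the Genetic Affairs algorithm; it just groups matches that are connected to each other through shared-match links (connected components), and the match names are invented.

```python
from collections import defaultdict


def cluster_shared_matches(shared_pairs):
    """Group matches into clusters: two matches land in the same cluster
    when a chain of shared-match links connects them."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    clusters, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


# Hypothetical shared-match pairs: each pair matches me and each other
pairs = [("M1", "M2"), ("M2", "M3"), ("M4", "M5")]
clusters = cluster_shared_matches(pairs)
print(clusters)
```

A real clustering tool is stricter than this, which is why a person can match some cluster members and still not be placed in the cluster.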
AutoSegment organizes your matches based on triangulating segments. AutoSegment employs the positional information of segments (chromosome and start and stop position) to identify overlapping segments in order to link DNA matches. In addition, triangulated data is used to corroborate these links. Using the user-defined minimum overlap of a DNA segment, we perform a clustering of overlapping DNA segments to identify segment clusters. The overlap is calculated in centimorgans using human genetic recombination maps. Another aspect of overlapping segments is the fact that some regions of our genome seem to have more matches as compared to the other regions. These so-called pile-up areas can influence the clustering. The removal of known pile-up regions based on the paper of Li et al 2014 is optional and is not performed for this analysis. However, a pile-up report is provided that allows you to examine your genome for pile-up regions.
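The overlap test at the heart of that description can be sketched as follows. One big simplification to be aware of: the real tool converts positions to centimorgans with a recombination map, while this sketch assumes a flat 1 cM per million base pairs. The segment positions and the 10 cM minimum are invented for illustration.

```python
def overlap_cm(seg_a, seg_b, cm_per_mb=1.0):
    """Approximate overlap in cM of two segments given as
    (chromosome, start_bp, end_bp); 0.0 if they do not overlap.
    ASSUMPTION: a flat cM-per-megabase rate instead of a recombination map."""
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return 0.0
    shared_bp = min(end_a, end_b) - max(start_a, start_b)
    return max(shared_bp, 0) / 1_000_000 * cm_per_mb


def link_matches(segments, min_overlap_cm):
    """Return pairs of matches whose segments overlap by at least min_overlap_cm."""
    names = sorted(segments)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if overlap_cm(segments[a], segments[b]) >= min_overlap_cm
    ]


# Hypothetical segments that each match shares with me
segments = {
    "M1": (3, 10_000_000, 30_000_000),
    "M2": (3, 12_000_000, 29_000_000),
    "M3": (3, 40_000_000, 55_000_000),
    "M4": (5, 10_000_000, 30_000_000),
}
links = link_matches(segments, min_overlap_cm=10.0)
print(links)
```

Only M1 and M2 are linked here: M3 sits further along chromosome 3, and M4 is on a different chromosome entirely.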
By comparing the tree of the tested person and the trees from the members of a certain cluster, we can identify ancestors that are common amongst those trees. First, we collect the surnames that are present in the trees and create a network using the similarity between surnames. Next, we perform a clustering on this network to identify clusters of similar surnames. A similar clustering is performed based on a network using the first names of members of each surname cluster. Our last clustering uses the birth and death years of members of a cluster to find similar persons. As a consequence, initially large clusters (based on the surnames) are divided up into smaller clusters using the first name and birth/death year clustering.
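Here is a rough sketch of that surname, first name, and birth year narrowing. The similarity measure (Python's `difflib`), the 0.75 threshold, the two-year tolerance, and the people listed are all my own assumptions; Genetic Affairs doesn't publish these details in the report.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.75):
    """True when two names are spelled alike (threshold is an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def same_person(p, q, year_tolerance=2):
    """p and q are (first_name, surname, birth_year) entries from two trees."""
    return (
        similar(p[1], q[1])                       # surname first
        and similar(p[0], q[0])                   # then first name
        and abs(p[2] - q[2]) <= year_tolerance    # then birth year
    )


def group_people(people):
    """Greedy grouping: each person joins the first group whose first
    member looks like the same person, otherwise starts a new group."""
    groups = []
    for person in people:
        for group in groups:
            if same_person(group[0], person):
                group.append(person)
                break
        else:
            groups.append([person])
    return groups


# Hypothetical tree entries, invented for this example
people = [
    ("John", "Smith", 1800),
    ("Jon", "Smyth", 1801),    # spelling variant of the same man
    ("John", "Smith", 1860),   # same name, different generation
    ("Mary", "Jones", 1800),
]
groups = group_people(people)
print(groups)
```

The two spelling variants end up together, while the later John Smith is split off by the birth year step, which is the "large clusters divided into smaller clusters" effect described above.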
AutoKinship automatically predicts family trees based on the amount of DNA your DNA matches share with you and each other. Note that AutoKinship does not require any known genealogical trees from your DNA matches. Instead, AutoKinship looks at the predicted relationships between your DNA matches, and calculates many different paths you could all be related to each other. The probabilities used by this AutoKinship analysis are based on simulated data for GEDmatch matches and are kindly provided by Brit Nicholson (methodology described here). Based on the shared cM data between shared matches, we create different trees based on the putative relationships. We then use the probabilities to test every scenario which are then ranked.
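The "test every scenario, then rank" step can be pictured with a small sketch. I'm assuming here that a tree's score is simply the product of the probabilities of each pairwise relationship it implies; the probabilities below are made-up placeholders, not values from Brit Nicholson's simulations, and the tree and match names are invented.

```python
from math import prod


def rank_trees(candidate_trees, pair_probabilities):
    """Score each candidate tree by multiplying the probability of every
    pairwise relationship it implies, then sort best first."""
    scored = []
    for name, relationships in candidate_trees.items():
        score = prod(
            pair_probabilities[pair][rel] for pair, rel in relationships.items()
        )
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder probabilities of each relationship for each pair of matches
pair_probabilities = {
    ("A", "B"): {"1C1R": 0.6, "2C": 0.2},
    ("A", "C"): {"1C1R": 0.3, "2C": 0.5},
}
# Two hypothetical trees, each implying one relationship per pair
candidate_trees = {
    "tree 1": {("A", "B"): "1C1R", ("A", "C"): "2C"},
    "tree 2": {("A", "B"): "2C", ("A", "C"): "1C1R"},
}
ranking = rank_trees(candidate_trees, pair_probabilities)
print(ranking[0][0])
```

Trees with identical scores are equally likely, which is why a report can show several trees tied for first place.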
Predicted trees from the AutoTree analysis are based on genealogical trees shared by the DNA matches and, if available, shared by the tested person. The relationships between DNA matches based on their common ancestors as provided by AutoTree are used to perform an AutoKinship analysis and are overlaid on the predicted AutoKinship tree.
AutoKinship Tree is New
AutoKinship Tree is the new feature that combines the features of both AutoTree and AutoKinship. You receive:
- Common ancestors between you and your matches
- Trees of people who don’t share your common ancestors but share ancestors with each other
- Combined with relationship predictions and
- A segment analysis
Of course, the relative success of the tree tools depends upon how many people have uploaded GEDCOM files.
Big hint, if you haven’t uploaded your family tree, do so now. If you are an adoptee or searching for a parent and don’t know who your ancestors are, AutoKinship Tree does its best without your tree information, and you will still benefit from the trees of others combined with predicted relationships based on DNA.
It’s easier to show you than to tell you, so let’s step through my results one section at a time.
I’m going to be using cluster 5 which has 32 members and cluster 136 which has 8 members. Ironically, cluster 136 is a much more useful cluster, with 8 good matches, than cluster 5 which includes 32 people.
Results of the AutoKinship Analyses
As you scroll down your results, you’ll see a grid beneath the Explanation area.
It’s easy to see which cluster received results for each tool. My cluster 5 has results in each category, along with surnames. (Notice that you can search for surnames which displays only the clusters that contain that surname.)
I can click on each icon to see what’s there waiting for me.
Additionally, you can click at the top on the blue middle “here” for an overview of all common ancestors. Who can resist that, right?
Click on the ancestor’s name or the tree link to view more information.
You can also view common locations by clicking on the blue “here” at far right. A location, all by itself, is a HUGE hint.
Clicking on the tree link shows you the tree of the tester with ancestors at that location. I had several others from North Carolina, generally, and other locations specifically. Let’s take a look at a few examples.
Common Ancestor Clusters
Click on the first blue link to view all common ancestors.
Common Ancestor Clusters summarize all of the clusters by ancestor. In other words, if any of your matches have ancestors in common in their tree, they are listed here.
These clusters include NOT just the people who share ancestors in a tree with you, but who also share known ancestors with each other BUT NOT YOU. That may be incredibly important when you are trying to identify your ancestors – as in brick walls. Your ancestors may be their ancestors too, or your common segments might lead to your common ancestors if you complete their tree.
There are other important hints too.
In my case, above, Jacob Lentz is my known ancestor.
However, Sarah Barron is not my ancestor, nor is John Vincent Dodson. They are the descendants of my Dodson ancestor though. I recognized that surname and those people. In other instances, recognizing a common geography may be your clue for figuring out how you connect.
In the cluster column at left, you can see the cluster number in which these people are found.
Common Locations Table
Clicking on the second link provides a Common Location Table.
Some locations are general, like a state, and others are town, county or even village names. Whatever people have included in their GEDCOM files that can be connected.
Looking at this first entry, I recognize some of the ancestral surnames of Karen’s ancestors. The fact that we are found in the same cluster and share DNA indicates a common ancestor someplace.
Check for this same person in additional locations, then look at their tree.
Ok, back to the AutoKinship Analysis Table and Cluster 136.
I’m going to use Cluster 136 as an example because this cluster has generated great reports using all of the tools, indicated by the icon under each column heading. Some clusters won’t have enough information for everything so the tools generate as much as possible.
Scrolling down to Cluster 136 in the AutoCluster Information report, just beneath the list of clusters, I can see my 8 matches in that cluster.
Of course, I can click on the links for specific information, or contact them via email. At the end of this article in the “Tell Me Everything” section, I’ll provide a way to retrieve as much information as possible about any one match. For now, let’s move to the AutoTree.
Cluster 136 AutoTree
Clicking on the icon under AutoTree shows me how two of the matches in this cluster are related to each other and myself.
Note that the centimorgan badges listed refer to the number of cM that I share with each of these people, not how much they share with each other.
Click on any of the people to see additional information.
When I click on J Lentz m F Moselman, a popup box shows me how this couple is related to me and my matches.
Of course, you can also view the Y DNA or mitochondrial DNA haplogroups if the testers have provided that information when they set up their GEDmatch profile information.
Just click on the little icons.
If the testers have not provided that information, you can always check at FamilyTreeDNA or 23andMe, if they have tested at either of those vendors, to view their haplogroup information.
Today, GEDmatch kit numbers are assigned randomly, but in the early days, before Genesis, the leading letter of A meant AncestryDNA, F or T for FamilyTreeDNA, M for 23andMe and H for MyHeritage. If the kit number is something else, perform a one-to-one or a one-to-many report which will display the source of their DNA file.
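If you have a long list of older kit numbers, the leading-letter rule above is easy to turn into a lookup. This is a small sketch; the kit numbers in the example are invented, and because newer kit numbers are random, treat any result as a hint and confirm it with a one-to-one or one-to-many report.

```python
# Leading letters used for pre-Genesis kit numbers, as described above
LEGACY_PREFIXES = {
    "A": "AncestryDNA",
    "F": "FamilyTreeDNA",
    "T": "FamilyTreeDNA",
    "M": "23andMe",
    "H": "MyHeritage",
}


def legacy_vendor(kit_number):
    """Guess the vendor from the first letter of an older kit number.
    Returns None when the letter is not one of the legacy prefixes."""
    return LEGACY_PREFIXES.get(kit_number.strip()[:1].upper())


# Invented kit numbers for illustration
print(legacy_vendor("A123456"))
print(legacy_vendor("ZZ9999999"))
```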
The small number, 136 in this case, beside the cM number indicates the cluster or clusters that these people are members of. Some people are members of multiple clusters.
Let’s see what’s next.
Cluster 136 Common Ancestors
Clicking on the Ancestors icon provides a report that shows all of the Ancestor Clusters in cluster 136.
The difference between this ancestor chart and the larger chart is that this only shows ancestors for cluster 136, while the larger chart shows ancestors for the entire AutoCluster report.
Cluster 136 Locations
All of the locations shown are included in trees of people who cluster together in cluster 136. Of course, this does NOT mean that these locations are all relevant to cluster 136. However, finding my own tree listed might provide an important clue.
Using the location tool, I discover 5 separate location clusters. This location cluster includes me with each tester’s ancestors who are found in Montgomery County, Ohio.
The difference between this chart for cluster 136 only and the larger location chart is that every location in this chart is relevant for people who all cluster together meaning we all share some ancestral line.
Viewing the trees of other people in the cluster may suggest ancestors or locations that are essential for breaking down brick walls.
Cluster 136 AutoKinship
Clicking on the anchor in the AutoKinship column provides a genetically reconstructed tree based on how closely each of the people match me, and each other. Clearly, in order to be able to provide this prediction, information about how your matches also match each other, or don’t, is required.
Again, the cM amount shown is the cM match with me, not with each other. However, if you click on a match, a popup will be shown that shows the shared cM between that person and the other matches as well as the relationship prediction between them in this tree.
So, Bill matches David with a total of 354.3 cM and they are positioned as first cousins once removed in this tree. The probability of the match being a 1C1R (first cousin once removed) is 64.9%, meaning of course that other relationships are possible.
Note that Bill and David ALSO share a segment with me in autosegment cluster 185, on chromosome 3.
It’s important to note that while 136 is the autocluster number, meaning that colored block on the report, WITHIN clusters, autosegment clusters are formed and numbered.
Each autosegment cluster receives its own number and the numbers are for the entire report. You will have more autosegment clusters than autoclusters, because at least some of the colorful autoclusters will contain more than one segment cluster.
Remember, autoclusters are those colorful boxes of matches that fly into place. Autosegment clusters are the matching triangulated clusters on chromosomes and they are represented by the blue bars, shown below.
AutoCluster 136 contains 5 different autosegment clusters, but Bill is only included in one of those autosegment clusters.
You’ll notice that there are some people, like Robin at the bottom, who do match some other people in the cluster, but either not enough people, or not enough overlapping DNA to be included as an autocluster member.
The small colored chromosomes with numbers, boxed in red, indicate the chromosome on which this person matches me.
If you click on that chromosome icon, you’ll see a popup detailing everyone who matches me on that segment.
Note that in some cases a member of a segment cluster, like Robin, did not make it in the AutoCluster cluster. You can spot these occurrences by scrolling down and looking at the cluster column which will then be empty for that particular match.
Reconstructed AutoKinship Trees in Most Likely Order
Scrolling down the page, next we see that we have multiple possible trees to view. We are shown the most likely tree first.
Tree likelihood is constructed based on the combined probability of my matching cM to an individual plus their likely relationship to each other based on the amount of DNA they share with each other as well.
In my case, all of the first 8 trees are equally likely to be accurate, based on autosomal genetic relationships only. The ninth tree is only very slightly less likely to be accurate.
The X chromosome is not utilized separately in this analysis, nor are Y or mitochondrial DNA haplogroups if provided.
DNA Relationship Matrix
Continuing to scroll down, we next see the DNA matrix that shows relationships for cluster 5 in a grid format. Click on “Download Relationship Matrix” to view in a spreadsheet.
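If you like to build your own grid, a relationship matrix is just a symmetric table of shared cM between pairs of people. Here is a sketch that writes one as CSV text. The layout is my own and is not meant to reproduce the format of the downloaded file. The only figure used is the 354.3 cM Bill shares with David, mentioned earlier; every other cell is left blank rather than filled with a guess.

```python
import csv
import io


def relationship_matrix(names, shared_cm):
    """Return CSV text for a symmetric grid of shared cM.
    Cells are blank where no shared amount is recorded."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + names)
    for row_name in names:
        row = [row_name]
        for col_name in names:
            # frozenset makes (Bill, David) and (David, Bill) the same key
            row.append(shared_cm.get(frozenset((row_name, col_name)), ""))
        writer.writerow(row)
    return buffer.getvalue()


names = ["Bill", "David", "Robin"]
shared_cm = {frozenset(("Bill", "David")): 354.3}
matrix_csv = relationship_matrix(names, shared_cm)
print(matrix_csv)
```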
Keep scrolling for the next view which is the Individual Segment Cluster Information
Individual Segment Cluster Information
Remember that we are still focused on only one cluster – in this case, cluster 136. Each cluster contains people who all match at least some subset of other people in the cluster. Some people will match each other and the tested person on the same chromosome segment, and some won’t. What we generally see within clusters are “subclusters” of people who match each other on different chromosomes and segments. Also, some matches from cluster 136 might match other people but those matches might not be a member of cluster 136.
In autocluster 136, I have 14 DNA segments that converge into 5 segment clusters with my matches. Here’s segment cluster 185 that consists of two people in addition to me. Note that for individuals to be included in these segment clusters at GEDmatch, they must triangulate with people in the same segment cluster.
From left to right, we see the following information:
- AutoCluster number 136, shown below
- Segment cluster 185. This is a segment cluster within autocluster 136.
- Segment cluster 185 occurs on chromosome 3, between the designated start and stop locations.
- The segment representation shows the overlapping portions of the two matches, to me. You can easily see that they overlap almost exactly with each other as well.
- The SNP count is shown, followed by the name and cM count.
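The triangulation requirement for segment clusters can be expressed as a simple three-way check: I match person X, I match person Y, and X and Y match each other, all on overlapping stretches of the same chromosome. This is a sketch with invented positions, not the actual GEDmatch or Genetic Affairs logic.

```python
def segments_overlap(a, b):
    """a and b are (chromosome, start, end) tuples."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def triangulates(me_to_x, me_to_y, x_to_y):
    """True when all three pairwise matches overlap one another.
    x_to_y is None when the two matches do not match each other there."""
    if x_to_y is None:
        return False
    return (
        segments_overlap(me_to_x, me_to_y)
        and segments_overlap(me_to_x, x_to_y)
        and segments_overlap(me_to_y, x_to_y)
    )


# Hypothetical segments on chromosome 3 (positions are made up)
me_to_x = (3, 10_000_000, 30_000_000)
me_to_y = (3, 12_000_000, 31_000_000)
x_to_y = (3, 11_000_000, 29_000_000)

print(triangulates(me_to_x, me_to_y, x_to_y))
print(triangulates(me_to_x, me_to_y, None))
```

The second call shows why two people who each match me on the same stretch are not a triangulated group unless they also match each other there.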
Cluster 136 AutoKinship Tree
The AutoKinship Tree column is different from the AutoKinship column in one fundamental way. The new AutoKinship Tree feature combines the genealogical AutoTree and the genetic AutoKinship output together in one report.
You can see that the “prior” genealogical tree information that one of my matches also descends from Jacob Lentz (and wife, if you click further) has now been included. The matches without trees have been reconstructed around the known genealogy based on how they match me and each other.
I was already aware of how I’m related to Bill, David, *C and *R, but I don’t know how I am related to these other people. Based on their kit identifier, I can go to the vendor where they tested and utilize tools there, and I can check to see if they have uploaded their DNA files elsewhere to discover additional records information or critical matches. Now at least I know where in the tree to search.
Cluster 136 AutoSegment
Clicking on AutoSegment provides you with segment information. Each cluster is painted on your chromosomes.
By hovering over the darkly colored segments, which are segment clusters, you can view who you match, although to view multiple matches, continue scrolling.
In the next section, you’ll see the two segment clusters contained wholly within cluster 136.
Following that is the same information for segment clusters partially linked to cluster 136, but not contained wholly within 136.
Bonus – Tell Me Everything – Individual Match Clusters
We’ve focused specifically on the AutoKinship tools, but if you’re interested in “everything” about one specific match, you can approach things from that perspective too. I often look at a cluster, then focus on individuals, beginning with those I can identify which focuses my search.
If you click on any person in your match list, you’ll receive a report focusing on that person in your autocluster.
Let’s use cousin Bill as an example. I know how he’s related to me.
You can choose to display your chosen cluster by:
- Number of shared matches
- Shared cM with the tester
I would suggest experimenting with all of the options to see which one displays the information that is most useful for the question you’re trying to answer.
Beneath the cluster for Bill, you’ll see the relevant information about the cluster itself. Bill has cluster matches on two different chromosomes.
The AutoCluster Cluster member Information report shows you how much DNA each cluster member shares with the tested person, which is me, and with each other cluster member. It’s easy to see at a glance who Bill is most closely related to by the number of cMs shared.
Only one of Bill’s chromosomes, #3, is included in clusters, but this tells me immediately that this/these segments on chromosome 3 triangulate between me, Bill, and at least one other person.
Segments shown in orange (chromosome 22) match me, but are not included in a cluster.
Special Use Cases – Unknown People
For adoptees and people trying to figure out how they are related to closer relatives, especially those without a tree, this new combined AutoKinship tool is wonderful.
400 cM is the upper default limit when running the report, meaning that close family members will not be included because they would be included in many clusters. However, you can make a different selection. If you’re trying to determine how several closely related people intersect, select a high threshold to include everyone.
Select a lower number of matches, like 25 or 50.
In this example, “no limit” was selected as the upper total match threshold and 25 closest matches.
AutoKinship then constructs a genetic tree and tells you which trees are possible and most likely. If some people do have trees, that common ancestor information would be included as well.
Note that when matches occur over the 400 cM threshold, there will be too many common chromosome matches so the chromosome numbers are omitted. Just check the other reports.
This tool would have helped a great deal with a recent close match who didn’t know how they are related to my family.
You can see this methodology in action and judge its accuracy by reconstructing your own family, assuming some of your known family members have uploaded to GEDmatch. Try it out.
It’s a Lot!
I know there’s a lot here to absorb, but take your time and refer back to this article as needed.
This flexible new tool combines DNA matching, genealogy trees, genetic trees, locations, autoclusters, a chromosome browser, and triangulation. It took me a few passes and working with different clusters to understand and absorb the information that is being provided.
For people who don’t know who their parents or close relatives are, these tools are amazing. Not only can they determine who they are related to, and who is related to each other, but with the use of trees, they can view common ancestors which provides possible ancestors for them too.
For people painting their triangulated segments at DNAPainter, AutoKinship provides triangulation groups that can be automatically painted using the Cluster Auto Painter, here, plus helps to identify that common ancestor. You can read more about DNAPainter, here.
For people seeking to break down brick walls, AutoKinship Tree provides assistance by providing tree matching between your matches for common ancestors NOT IN YOUR TREE, but that ARE in theirs. Your brick walls are clearly not (yet) identified in your tree, although that’s our fervent hope, right?
Even if your matches’ trees don’t go far enough back, as a genealogist, you can extend those trees further to hopefully reveal a previously unknown common ancestor.
The Best Things You Can Do
Aside from DNA testing, the three best things you can do to help yourself and your clusters are:
- Upload your GEDCOM file, complete with locations, so you have readily available trees. Ask your matches to do so as well. Trees help you and others too.
- Encourage people you match at Ancestry, which provides no chromosome segment information or chromosome browser, to upload a copy of their DNA files and tree.
- Test your family members and cousins, and encourage them to upload their DNA and their trees. Offer to assist them. You can find step-by-step download/upload instructions here.
Sign Up Now – It’s Free!
If you enjoyed this article, subscribe to DNAeXplain for free to automatically receive new articles by email each week.
Here’s the link. Just look for the little grey “follow” button on the right-hand side on your computer screen below the black title bar, enter your e-mail address, and you’re good to go!
In case you were wondering, I never have shared and never will share or use your e-mail outside of the intended purpose.
Share the Love
You can always forward these articles to friends or share by posting links on social media. Who do you know that might be interested?
Follow DNAexplain on Facebook, here or follow me on Twitter, here.
You Can Help Keep This Blog Free
I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.
Thank you so much.