Genetic Affairs: AutoPedigree Combines AutoTree with WATO to Identify Your Potential Tree Locations

If you’re an adoptee or searching for an unknown parent or ancestor, AutoPedigree is just what you’ve been waiting for.

By now, we’re all familiar with Genetic Affairs, which launched in 2018 with its signature AutoCluster tool. AutoCluster groups your matches into clusters based on which of your matches also match each other, in addition to matching you.

browser autocluster

A year later, in December 2019, Genetic Affairs introduced AutoTree, automated tree reconstruction based on your matches’ trees at Ancestry and Family Finder at Family Tree DNA, even if you don’t have a tree.

Now, Genetic Affairs has introduced AutoPedigree, which combines the AutoTree reconstruction technology with WATO, What Are the Odds, as seen here at DNAPainter. WATO is a statistical probability technique developed by the DNAGeek that allows users to evaluate the possible positions in a tree where they best fit.

Here’s how the Genetic Affairs tools build on one another:

  • AutoCluster groups people based on whether they match you and each other
  • AutoTree finds common ancestors for trees from each cluster
  • Next, AutoTree finds the trees of all matches combined, including from trees of your DNA matches not in clusters
  • AutoPedigree checks to see if a common ancestor tree meets the minimum requirement, which is at least 3 matches of greater than or equal to 30-40 cM. If yes, an AutoPedigree with hypotheses is created based on the common ancestor of the matching people.
  • Combined AutoPedigrees then reviews all AutoTrees and AutoPedigrees that have common ancestors and combines them into larger trees.
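
To make the minimum requirement in the list above concrete, here is a small Python sketch. It is only an illustration, not Genetic Affairs’ actual code: the function name, the parameter names, the sample cM values, and the choice of 30 cM as the default (the low end of the 30-40 cM range) are all my assumptions.

```python
def meets_autopedigree_minimum(match_cms, min_matches=3, min_cm=30.0):
    """Return True if enough matches in a common ancestor tree are strong enough.

    match_cms: shared cM amounts for the matches found in one common ancestor tree.
    min_matches and min_cm are illustrative defaults, not official settings.
    """
    strong_matches = [cm for cm in match_cms if cm >= min_cm]
    return len(strong_matches) >= min_matches


# Hypothetical cM values for two reconstructed trees.
print(meets_autopedigree_minimum([44.2, 61.0, 35.5, 12.0]))  # three matches at 30 cM or more
print(meets_autopedigree_minimum([44.2, 22.0, 18.5]))        # only one match at 30 cM or more
```

A tree that fails this kind of check would simply not receive an AutoPedigree, which is why some clusters have no AutoPedigree file.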

Let’s look at examples, beginning with DNAPainter, which first implemented a form of WATO.

DNA Painter

Let’s say you’re trying to figure out how you’re related to a group of people who descend from a specific ancestral couple. This is particularly useful for someone seeking unknown parents or other unknown relationships.

DNA tools are always from the perspective of the tester, the person whose kit is being utilized.

At DNAPainter, you manually create the pedigree chart beginning with a common couple and creating branches to all of their descendants that you match.

This example at DNAPainter shows the matches with their cM amounts in yellow boxes.

xAutoPedigree DNAPainter WATO2

The tester doesn’t know where they fit in this pedigree chart, so they add other known lines and create hypothesis placeholder possibilities in light blue.

In other words, if you’re searching for your mother and you were born in 1970, you know that your mother was likely born between 1925 (if she was 45 when she gave birth to you) and 1955 (if she was 15 when she gave birth to you.) Therefore, in the family you create, you’d search for parents who could have given birth to children during those years and create hypothetical children in those tree locations.
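
The birth year arithmetic above is easy to sketch in Python. The function name is made up for illustration, and the ages of 15 and 45 are simply the bounds used in the example, not fixed rules.

```python
def parent_birth_year_range(child_birth_year, min_age=15, max_age=45):
    """Return the (earliest, latest) plausible birth years for a parent.

    min_age and max_age are the parent's assumed age limits at the child's birth.
    """
    earliest = child_birth_year - max_age  # parent was as old as possible
    latest = child_birth_year - min_age    # parent was as young as possible
    return earliest, latest


print(parent_birth_year_range(1970))  # (1925, 1955)
```

You would then look for couples in the tree who could have had a child born in that window and add a hypothetical child there.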

The WATO tool then utilizes the combination of expected cMs at that position to create scores for each hypothesis position based on how closely or distantly you match other members of that extended family.

The Shared cM Project, created and recently updated by Blaine Bettinger is used as the foundation for the expected centimorgan (cM) ranges of each relationship. DNAPainter has automated the possible relationships for any given matching cM amount, here.

In the graphic above, you can see that the best hypothesis is #2 with a score of 1, followed by #4 and #5 with scores of 3 each. Hypothesis 1 has a score of 63.8979 and hypothesis 3 has a score of 383.

You’ll need to scroll to the bottom to determine which of the various hypotheses are most likely.

Autopedigree DNAPainter calculated probability

Using DNAPainter’s WATO implementation requires you to create the pedigree tree to test the hypothesis. The benefit of this is that you can construct the actual pedigree as known based on genealogical research. The downside, of course, is that you have to research each line forward to the present to be able to create the pedigree accurately, and that’s a long and sometimes difficult manual process.

Genetic Affairs and WATO

Genetic Affairs takes a different approach to WATO. Genetic Affairs removes the need for hand entry by scanning your matches at Ancestry and Family Tree DNA, automatically creating pedigrees based on your matches’ trees. In addition, Genetic Affairs automatically creates multiple hypotheses. You may need to utilize both approaches, meaning Genetic Affairs and DNAPainter, depending on who has tested, tree completeness at the vendors, and other factors.

The great news is that you can import the Genetic Affairs reconstructed trees into DNAPainter’s WATO tool instead of creating the pedigrees from scratch. Of course, Genetic Affairs can only use the trees someone has entered. You, on the other hand, can create a more complete tree at DNAPainter.

Combining the two tools leverages the unique and best features of both.

Genetic Affairs AutoPedigree Options

Recently, Genetic Affairs released AutoPedigree, their new tool that utilizes the reconstructed AutoTrees+WATO to place the tester in the most likely region or locations in the reconstructed tree.

Let’s take a look at an example. I’m using my own kit to see what kind of results and hypotheses exist for where I fit in the tree reconstructed from my matches and their trees.

If you actually do have a tree, the AutoTree portion will simply be counted as an equal tree to everyone else’s trees, but AutoPedigree will ignore your tree, creating hypotheses as if it doesn’t exist. That’s great for adoptees who may have hypothetical trees in progress, because that tree is disregarded.

First, sign on to your account at Genetic Affairs and select the AutoPedigree option for either Ancestry or Family Tree DNA which reconstructs trees and generates hypotheses automatically. For AutoPedigree construction, you cannot combine the results from Ancestry and FamilyTreeDNA like you can when reconstructing trees alone. You’ll need to do an AutoPedigree run for each vendor. The good news is that while Ancestry has more testers and matches, FamilyTreeDNA has many testers stretching back 20 years or so in the past who passed away before testing became available at Ancestry. Often, their testers reach back a generation or two further. You can easily transfer Ancestry (and other) results to Family Tree DNA for free to obtain more matches – step-by-step instructions here.

At Genetic Affairs, you should also consider including half-relations, especially if you are dealing with an unknown parent situation. Selecting half-relationships generates very large trees, so you might want to do the first run without, then a second run with half relationships selected.

AutoPedigree options

Results

I ran the program and opened the resulting email with the zip file. Saving that file automatically unzips for me, displaying the following 5 files and folders.

Autopedigree cluster

Clicking on the AutoCluster HTML link reveals the now-familiar clusters, shown below.

Autopedigree clusters

I have a total of 26 clusters, only partially shown above. My first peach cluster and my 9th blue cluster are huge.

Autopedigree 26 clusters

That’s great news because it means that I have a lot to work with.

autopedigree folder

Next, you’ll want to click to open your AutoPedigree folder.

For each cluster, you’ll have a corresponding AutoPedigree file if an AutoPedigree can be generated from the trees of the people in that cluster.

My first cluster is simply too large to show successfully in blog format, so I’m selecting a smaller cluster, #21, shown below with the red arrow, with only 6 members. Why so small, you ask? In part, because I want to illustrate the fact that you really don’t need a lot of matches for the AutoPedigree tool to be useful.

Autopedigree multiple clusters

Note also that this entire group of clusters (blue through brown) has members in more than one cluster, indicated by the grey cells that mean someone is a member of at least 2 clusters. That tells me that I need to include the information from those clusters too in my analysis. Fortunately, Genetic Affairs realizes that and provides a combined AutoPedigree tool for that as well, which we will cover later in the article. Just note for now that the blue through brown clusters seem to be related to cluster 21.

Let’s look at cluster 21.

autopedigree cluster 21

In the AutoPedigree folder, you’ll see cluster files when there are trees available to create pedigrees for individual clusters. If you’re lucky, you’ll find 2 files for some clusters.

autopedigree ancestors

At the top of each cluster AutoPedigree file, Genetic Affairs shows you the home couple of the descendant group shown in the matches and their corresponding trees.

Autopedigree WATO chart

Image 1

I don’t expect you to be able to read everything in the above pedigree chart, just note the matches and arrows.

You can see three of my cousins who match, labeled with “Ancestry.” You also see branches that generate a viable hypothesis. When generating AutoPedigrees, Genetic Affairs truncates any branches that cannot produce a viable hypothesis for placing the tester on the tree, so you may not see all matches.

Autopedigree hyp 1

Image 2

On the top branch, you’ll see hyp-1-child1, which is the first hypothesis, with the first child. Their child is hyp-2-child2, and their child is hyp-3-child3. The tester (me, in this case) cannot be the persons shown with red flags, called badges, based on how I match other people and other tree information such as birth and death dates.

Think of a stoplight: red means no, green marks your best bets, and the rest are yellow, meaning maybe. AutoPedigree makes no decisions. It only shows you options and calculates mathematically how probable each location is to be correct.

Remember, these “children,” such as hyp-1-child1, may or may not have actually existed. These relationships are hypothetical, showing you where the tester could appear on the tree IF these people existed.

We know that I don’t fit on the branch above hypothesis 1, because I only match the descendant of Adam Lentz at 44.2 cM which is statistically too low for me to also inhabit that branch.

I’ve included half relationships, so we see hyp-7-child1-half too, which is a half-sibling.

Hypotheses 1, 2, and 7 all have red badges, meaning not possible, so they have a score of 0. Hypotheses 3 and 8 are possible, each with a ranking of 16.

autopedigree my location

Image 3

Looking now at the next segment of the tree, you see that based on how I match my Deatsman and Hartman cousins, I can potentially fit in any portion of the tree with green badges (in the red boxes) or yellow badges.

You can also see where I actually fit in the tree. HOWEVER, that placement is from AutoTree, the tree reconstruction portion, based on the fact that I have a tree (or someone has a tree with me in it). My own tree is ignored by the AutoPedigree hypothesis generation portion.

Had my first cousins once removed through my grandfather John Ferverda’s brother, Roscoe, tested AND HAD A TREE, there would have been no question where I fit based on how I match them.

autopedigree cousins

As it turns out, they did test but provided no tree, meaning that Genetic Affairs had no tree to work with.

Remember that I mentioned that my first cluster was huge. Many more matches mean that Genetic Affairs has more to work with. From that cluster, here’s an example of a hypothesis being accurate.

autopedigree correct

Image 4

You can see the hypothetical line beneath my own line, with hypotheses 104, 105, 106, 107, and 108. The AutoTree portion of my tree is shown above, with my father and grandparents and my name in the green block. The AutoPedigree portion ignores my own tree and therefore generates a hypothesis showing where I could fit, with a rank of 2. And yes, that’s exactly where I fit in the tree.

In this case, there were some hypotheses ranked at 1, but they were incorrect, so be sure to evaluate all good (green) options, then yellow, in that order.

Genetic Affairs cannot work with 23andMe results for AutoPedigree because 23andMe doesn’t provide or support trees on their site. AutoClusters are integrated at MyHeritage, but not the AutoTree or AutoPedigree functions, and they cannot be run separately.

That leaves Family Tree DNA and Ancestry.

Combined AutoPedigree

After evaluating the AutoPedigree generated for each cluster that has one, click on the various combined cluster AutoPedigrees.

autopedigree combined

You can see that for cluster 1, I have 7 separate AutoPedigrees based on common ancestors that were different. I have 3 AutoPedigrees also for cluster 9, and 2 AutoPedigrees for 15, 21, and 24.

I have no AutoPedigrees for clusters 2, 3, 5, 6, 7, 8, 14, 17, 18, and 22.

Moving to the combined clusters, the numbers of which are NOT correlated to the clusters themselves, Genetic Affairs has searched trees and combined ancestors in various clusters together when common ancestors were found.

Autopedigree multiple clusters

Remember that I asked you to note that the above blue through brown clusters seem to have commonality between the clusters based on grey cell matches who are found in multiple groups? In fact, these people do share common ancestors, with a large combined AutoPedigree being generated from those multiple clusters.

I know you can’t read the tree in the image that follows. I’m only including it so you’ll see the scale of that portion of my tree that can be reconstructed from my matches with hypotheses of where I fit.

autopedigree huge

Image 5

These larger combined pedigrees are very useful to tie the clusters together and understand how you match numerous people who descend from the same larger ancestral group, further back in time.

Integration with DNAPainter

autopedigree wato file

Each AutoPedigree file and combined cluster AutoPedigree file in the AutoPedigree folder is provided in WATO format, allowing you to import them into DNAPainter’s WATO tool.

autopedigree dnapainter import

You can manually flesh out the trees based on actual genealogy in WATO at DNAPainter, and manually add matches from GEDmatch, 23andMe or MyHeritage, or matches from vendors where your matches’ trees may not exist but you know how your match connects to you.

Your AutoTree Ancestors

But wait, there’s more.

autopedigree ancestors folder

If you click on the Ancestors folder, you’ll see 5 options for tree generations 3-7.

autopedigree ancestor generations

My three-generation auto-generated reconstructed tree looks like this:

autopedigree my tree

Selecting the 5th generation level displays Jacob Lentz and Frederica Ruhle, the couple shown in the AutoCluster 21 and AutoPedigree examples earlier. The color-coding indicates the source of the ancestors in that position.

Autopedigree expanded tree

You will also note that Genetic Affairs indicates how many matches I have that share this common ancestor along with which clusters to view for matches relevant to specific ancestors. How cool is this?!!

Remember that you can also import the genetic match information for each AutoTree cluster found at Family Tree DNA into DNAPainter to paint those matches on your chromosomes using DNAPainter’s Cluster Auto Painter.

If you run AutoCluster for matches at 23andMe, MyHeritage, or FamilyTreeDNA, all vendors who provide segment information, you can also import that cluster segment information into DNAPainter for chromosome painting.

However, from that list of vendors, you can only generate AutoTrees and AutoPedigrees at Family Tree DNA. Given this, it’s in your best interest for your matches to test at or upload their DNA (plus tree) to Family Tree DNA, which both supports trees AND provides segment information, and where you can run AutoTree and AutoPedigree.

Have you painted your clusters or generated AutoTrees? If you’re an adoptee or looking for an unknown parent or grandparent, the new AutoPedigree function is exactly what you need.

Documentation

Genetic Affairs provides complete instructions for AutoPedigree in this newsletter, along with a user manual here, and the Facebook Genetic Affairs User Group can be found here.

I wrote the introductory article, AutoClustering by Genetic Affairs, here, and Genetic Affairs Reconstructs Trees from Genetic Clusters – Even Without Your Tree or Common Ancestors, here. You can read about DNAPainter, here.

Transfer your DNA file, for free, from Ancestry to Family Tree DNA or MyHeritage, by following the easy instructions, here.

Have fun! Your ancestors are waiting.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Shared cM Project 2020 Analysis, Comparison & Handy Reference Charts

Recently, Blaine Bettinger published V4 of the Shared cM Project, and along with that, Jonny Perl at DNAPainter updated the associated interactive tool as well, including histograms. I wrote about that, here.

The goal of the Shared cM Project was and remains to document how much DNA can be expected to be shared by various individuals at specific relationship levels. This information allows matches to at least minimally “position” themselves in a general location in their trees or, conversely, to eliminate specific potential relationships.

Shared cM Project match data is gathered by testers submitting their match information through the submission portal, here.

When the Shared cM Project V3 was released in September 2017, I combined information from various sources and provided an analysis of that data, including the changes from the V2 release in 2016.

I’ve done the same thing this year, adding the new data to the previous release’s table.

Compiled Comparison Table

I initially compiled this table for myself, then decided to update it and share with my readers. This chart allows me to view various perspectives on shared data and relationships and in essence has all the data I might need, including multiple versions, in one place. Feel free to copy and save the table.

In the comparison table below, the relationship rows with data from various sources are shown as follows:

  • White – Shared cM Project 2016
  • Peach – Shared cM Project 2017
  • Purple – Shared cM Project 2020
  • Green – DNA Detectives chart

I don’t know if DNA Detectives still uses the “green chart” or if they have moved to the interactive DNAPainter tool. I’ve retained the numbers for historical reference regardless.

Additionally, in some places, you’ll see references to the “degree of relationship,” as in “third degree relatives always match each other.” I’ve included a “Degree of Relationship” column to the far right, but I don’t come across those “relationship degree” references often anymore either. However, it’s here for reference if you need it.

23andMe still gives relationships in percentages, so I’ve included the expected shared percent of DNA for each relationship and the actual shared range from the DNA Detectives Green Chart.

One column shows the expected shared cM amount, assuming that 50% of the DNA from each ancestor is passed on in each generation. Clearly, we know that inheritance doesn’t happen that cleanly because recombination is a random event and children do NOT inherit exactly half of each ancestor’s DNA carried by their parents, but the average should be someplace close to this number.
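
Here is a short Python sketch of how that expected column can be calculated for direct-line ancestors. The 6800 cM total is my assumption, taken as double the roughly 3400 cM expected between a parent and child, and the function name is made up for illustration. Different vendors report slightly different totals.

```python
TOTAL_CM = 6800.0  # assumption: twice the ~3400 cM expected for parent/child


def expected_shared_cm(generations_back):
    """Expected cM shared with a direct ancestor, halving with each generation."""
    return TOTAL_CM * 0.5 ** generations_back


for label, generations in [("parent", 1), ("grandparent", 2), ("great-grandparent", 3)]:
    share = 0.5 ** generations
    print(f"{label}: {share:.1%} = {expected_shared_cm(generations):.0f} cM")
```

Real matches scatter around these values because recombination is random, which is exactly why the project collects actual ranges.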

shared cm table 2020

The first thing I noticed about V4 is that there is a LOT more data which means that the results are likely more accurate. V4 increased by 32K data points, or 147%. Bravo to everyone who participated, to Blaine for the analysis and to Jonny for automating the results at DNAPainter.

Methods

Blaine provided his white paper, here, which includes “everything you need to know” about the project, and I strongly encourage you to read it. Not only does this document explain the process and methods, it’s educational in its own right.

On the first page, Blaine discusses issues. Any time you are crowd sourcing information, you’re going to encounter challenges and errors. Blaine did remove any entries that were clearly problematic, plus an additional 1% of all entries for each category – .5% from each end meaning the largest and smallest entries. This was done in an attempt to remove the results most likely to be erroneous.

Known issues include:

  • Data entry errors – I refer to these as “clerical mutations,” but they happen and there is no way, unless the error is egregious, to know what is a typo and what is real. Obviously, a parent sharing only a 10 cM segment with a child is not possible, but other data entry errors are well within the realm of possible.
  • Incorrect relationships – Misreported or misunderstood relationships will skew the numbers. Relationships may be believed to be one type, but are actually something else. For example, a half vs full sibling, or a half vs full aunt or uncle.
  • Misunderstood Relationships – People sometimes confuse “half” and “removed” relationships. I wrote a helpful article titled Quick Tip – Calculating Cousin Relationships Easily.
  • Endogamy – Endogamy occurs when a population intermarries within itself, meaning that the same ancestral DNA is present in many members of the community. The genetic result is that you may share more DNA with those cousins than you would otherwise share with cousins at the same distance without endogamy.
  • Pedigree Collapse – Pedigree collapse occurs when you find the same ancestors multiple times in your tree. The closer to current those ancestors appear, the more DNA you will potentially carry from those repeat ancestors. The difference between endogamy and pedigree collapse is that endogamy is a community event and pedigree collapse has only to do with your own tree. You might just have both, too.
  • Company Reporting Differences – Different companies report DNA in different ways in addition to having different matching thresholds. For example, Family Tree DNA includes in your match total all DNA to 1 cM that you share with a match over the matching threshold. Conversely, Ancestry has a lower matching threshold, but often strips out some matching DNA using Timber. 23andMe counts fully identical segments twice and reports the X chromosome in their totals. MyHeritage does not report the X chromosome. There is no “right” or “wrong,” or standardization, simply different approaches. Hopefully, the variances will be removed or smoothed in the averages.
  • Distant Cousin Relationships – While this isn’t really an issue, per se, it’s important to understand what is being reported beyond 2nd cousin relationships: the only data used to calculate these averages comes from people who DO share DNA with their more distant cousins. In other words, if you do NOT match your 3rd cousin, then your “0” shared DNA is not included in the average. Only those who do match have their matching amounts included. This means that the average is only the average of people who match, not the average of all 3rd cousins.
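
The last point in the list above, about distant cousins, is easier to see with numbers. The following Python sketch uses completely made-up cM values for five pairs of 3rd cousins, two of whom share no detectable DNA. The function names are hypothetical.

```python
def average_of_matches_only(shared_cms):
    """Average over pairs who share DNA, the way the project's averages work."""
    matching = [cm for cm in shared_cms if cm > 0]
    return sum(matching) / len(matching) if matching else 0.0


def average_of_all_pairs(shared_cms):
    """Average over every pair, counting non-matching pairs as 0 cM."""
    return sum(shared_cms) / len(shared_cms) if shared_cms else 0.0


# Made-up values for illustration only, not project data.
third_cousin_pairs = [0, 0, 40, 80, 120]
print(average_of_matches_only(third_cousin_pairs))  # 80.0
print(average_of_all_pairs(third_cousin_pairs))     # 48.0
```

The reported average is the first number, so it will always be higher than the average for all cousins at that level once some cousins stop matching.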

Challenges aside, the Shared cM Project provides genealogists with a wonderful opportunity to use the combined data of tens of thousands of relationships to estimate and better understand the relationship range of our matches.

The Shared cM Project in combination with DNAPainter provides us with a wonderful tool.

Histograms

When analyzing the data, one of the first things I noticed was a very unusual entry for parent/child relationships.

We all know that each child inherits exactly half of each parent’s DNA. We expect to find an amount in the ballpark of 3400 cM, give or take a bit for normal variances like read errors or reporting differences.

Shared cM parent child.png

I did not expect to see a minimum shared cM amount for a child/parent relationship at 2376, fully 1024 cM below the expected value of 3400 cM. Put bluntly, that’s simply not possible. You cannot live without nearly one third of the DNA you should inherit from a parent. If this data is actually accurate from someone’s account, please contact me because I want to actually see this phenomenon.

I reached out to Blaine, knowing this result is not actually possible, wondering how this would ever get through the quality control cycle at any vendor.

After some discussion, here’s Blaine’s reply:

If you look at the histogram, you’ll see that those are most likely outliers. One of my lessons for the ScP (Shared cM Project) lately is that people shouldn’t be using the data without the histograms.

People get frustrated with this, but I can’t edit data without a basis even if I think it doesn’t make sense. I have to let the data itself decide what data to remove. So I removed 1% from each relationship, the lowest 0.5% and the highest 0.5%. I could have removed more, but based on the histograms, [removing] more appeared to be removing too much valid data. As people submit more parent/child relationships these outliers/incorrect submissions will be removed. But thankfully using the histograms makes it clear.

Indeed, if you look on page 23 on Blaine’s white paper, you’ll see the following histogram of parent/child relationships submitted.

shared cm histogram.png

Keep in mind that Blaine already removed any obvious errors, plus 1% of the total, 0.5% from each end of the spectrum. In this case, he utilized 2412 submissions, so he would have removed about 24 entries that were even further out on the data spectrum.
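
For anyone curious how trimming 0.5% from each end works out to about 24 entries, here is a rough Python sketch. It is not Blaine’s actual procedure. The function name and the rounding down are my assumptions, and the list is just placeholder numbers standing in for real submissions.

```python
def trim_outliers(values, fraction_per_end=0.005):
    """Drop the smallest and largest fraction_per_end of the sorted values."""
    ordered = sorted(values)
    cut = int(len(ordered) * fraction_per_end)  # assumption: round down
    if cut == 0:
        return ordered
    return ordered[cut:len(ordered) - cut]


# Placeholder values standing in for 2412 parent/child submissions.
submissions = list(range(2412))
kept = trim_outliers(submissions)
print(len(submissions) - len(kept))  # 24 entries removed, 12 from each end
```

Anything questionable that remains after trimming, like the 2376 cM minimum, simply sat inside the retained 99%.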

On the chart above, we can see that a total of about 14 are still really questionable. It’s not until we get to 3300 that these entries seem feasible. My speculation is that these people meant to type 3400 instead of 2400, and so forth.

shared cm parent grid.png

The great news is that Jonny Perl at DNAPainter included the histograms so you can judge for yourself if you are in the weeds on the outlier scale by clicking on the relationship.

shared cm parent submissions.png

Other relationships, like this niece/nephew relationship fit the expected bell shaped curve very nicely.

shared cm niece.png

Of course, this means that if you match your niece or nephew at 900 cM instead of the range shown above, that person is probably not your full niece or nephew – a revelation that may be difficult because of the implications for you, your parent and sibling. This would suggest that your sibling is a half sibling, not a full sibling.

Entering specific amounts of shared DNA and outputting probabilities of specific relationships is where the power of DNAPainter enters the picture. Let’s enter 900 cM and see what happens.

shared cm half niece.png

That 900 cM match is likely your half niece or nephew. Of course, this example illustrates perfectly why some relationships are entered incorrectly – especially if you don’t know that your niece or nephew is a half niece or nephew – because your sibling is a half-sibling instead of a full sibling. Some people, even after receiving results don’t realize there is a discrepancy, either because their data is on the boundary, with various relationships being possible, or because they don’t understand or internalize the genetic message.

shared cm full siblings.png

This phenomenon probably explains the low minimum value for full siblings, because many of those full siblings aren’t. Let’s enter 1613 and see what DNAPainter says.

shared cm half sibling.png

You’ll notice that DNAPainter shows the 1613 cM relationship as a half-sibling.

shared cm sibling.png

And the histogram indeed shows that 1613 would be the outlier. Being larger than 1600, it would appear in the 1700 category.
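
If you want to work out which histogram category a value lands in, this Python sketch shows the idea. I’m assuming the categories are 100 cM wide and labeled by their upper edge, which is what the 1613 example implies. The function name is made up.

```python
import math


def histogram_category(cm, bin_width=100):
    """Return the upper edge of the bin that a shared cM value falls into."""
    return math.ceil(cm / bin_width) * bin_width


print(histogram_category(1613))  # 1700
```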

shared cm half vs full.png

Accurately discerning close relationships is often incredibly important to testers. In the histogram chart above, you can see that the blue and orange histograms plotted on the same chart show that there is only a very small amount of overlap between the two histograms. This suggests that some people, those in the overlap range, who believe they are full siblings are in reality half-siblings, and possibly, a few in the reverse situation as well.

What Else is Noteworthy?

First, some relationships cannot be differentiated or sorted out by using the cM data or histogram charts alone.

shared cm half vs aunt.png

For example, you cannot tell the difference between half-siblings and an aunt/uncle relationship. In order to make that determination, you would need to either test or compare to additional people or use other clues such as genealogical research or geographic proximity.

Second, the ranges of many relationships are wider than they were before. Often, we see the lows being lower and the highs being higher as a result of more data.

shared cm low high.png

For example, take a look at grandparents. The expected amount is 1700 cM, and the average is 1754, which is very close to the previous averages of 1765 and 1766. However, the minimum is now 984 and the new maximum is 2462.

Why might this be? Are ranges actually wider?

Blaine removed 1% each time, which means that in V3, 6 results would have been removed, 3 from each end, while 11 would be removed in V4. More data means that we are likely to see more outliers as entries increase, and the relationship ranges are increasingly likely to overlap at the minimum and maximum ends.

Third, it’s worth noting that several relationships share an expected amount of DNA that is equal, 12.5% which equals 850 cM, in this example.

shared cm 4 relationships.png

These four relationships appear to be exactly the same, genetically. The only way to tell which one of these relationships is accurate for a given match pair, aside from age (sometimes) and opportunity, is to look at another known relationship. For example, how closely might the tester be related to a parent, sibling, aunt, uncle or first cousin, or one of their other matches. Occasionally, an X chromosome match will be enlightening as well, given the unique inheritance path of the X chromosome.

Additional known relationships help narrow unknown relationships, as might Y DNA or mitochondrial DNA testing, if appropriate. You can read about who can test for the various kinds of tests, here.

Fourth, it’s been believed for several years that all relatives of the 5th degree or closer match, and the V4 data confirms that.

shared cm 5th degree.png

There are no zeroes in the column for minimum DNA shared, 4th column from right.

5th degree relatives include:

  • 2nd cousins
  • 1st cousins twice removed
  • Half first cousins once removed
  • Half great-aunt/uncle

Fifth, some of your more distant cousins won’t match you, beginning with 6th degree relationships.

shared cm disagree.png

At the 6th degree level, the following relationships may share no DNA above the vendor matching threshold:

  • First cousins three times removed
  • Half first cousins twice removed
  • Half second cousins
  • Second cousins once removed

You’ll notice that the various reporting models and versions don’t always agree, with earlier versions of the Shared cM Project showing zeroes in the minimum amount of DNA shared.

Sixth, at the 7th degree level, some number of people in every relationship class don’t share DNA, as indicated by the zeros in the Shared cM Minimum column.

shared cm 7th degree.png

The more generations back in time that you move, the fewer cousins can be expected to match.

shared cm isogg cousin match.png

This chart from the ISOGG Wiki Cousin statistics page shows the probability of matching a cousin at a specific level based on information provided by testing companies.

Quick Reference Chart Summary

In summary, V4 of the Shared cM Project confirms that all 2nd cousins can expect to match, but beyond that in your trees, cousins may or may not match. I suspect, without evidence, that the further back in time that people are related, the less likely that the proper “cousinship level” is reported. For example, it would be easier to confuse 7th and 8th cousins as compared to 1st and 2nd cousins. Some people also confuse 8th cousins with 8 generations back in your tree. It’s not equivalent.

shared cm eighth cousin.png

It’s interesting to note that Degree 17 relatives, 8th cousins, 9 generations removed from each other (counting your parents as generation 1), still match in some cases. Note that some companies and people count you as generation 1, while others count your parents as generation 1.

The estimate that autosomal matching reaches 5 or 6 generations back in time, meaning descendants of common 4 times great-grandparents will sometimes match, is accurate as far as it goes, although 5-6 generations is certainly not a line in the sand.

It would be more accurate to state that:

  • 2nd cousins, people descended from common great-grandparents, 3 generations back in time, will always match
  • 4th cousins, people descended from common 3 times great-grandparents, 5 generations back in time, will match about half of the time
  • 8th cousins, people descended from common 7 times great-grandparents, 9 generations back in time, still match a small percentage of the time
  • Cousins from more distant ancestors can possibly match, but it’s unlikely and may result from a more recent unknown ancestor
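The relationship between cousin level and generations in the list above follows a simple pattern, sketched here for full cousins. The function names are hypothetical, and the count treats your parents as generation 1, as this article does.

```python
def generations_to_common_ancestors(cousin_degree):
    """Generations back to the shared ancestral couple, parents = generation 1."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return cousin_degree + 1


def ancestor_label(cousin_degree):
    """Name the shared ancestors for full cousins of the given degree."""
    greats = cousin_degree - 1
    if greats == 0:
        return "grandparents"
    if greats == 1:
        return "great-grandparents"
    return f"{greats} times great-grandparents"


for degree in (2, 4, 8):
    print(degree, generations_to_common_ancestors(degree), ancestor_label(degree))
# 2 3 great-grandparents
# 4 5 3 times great-grandparents
# 8 9 7 times great-grandparents
```

This is why 8th cousins are not the same thing as 8 generations back in your tree: the shared ancestors sit one generation further back than the cousin number.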

I created this summary chart, combining information from the ISOGG chart and the Shared cM Project as a handy quick reference. Enjoy!

shared cm quick reference.png

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

Fun DNA Stuff

  • Celebrate DNA – customized DNA themed t-shirts, bags and other items

The Shared cM Project Version 4 Released

Version 4 of the Shared cM Project has been released, utilizing over 60,000 known relationship results submitted by genealogists. The Shared cM Project was begun in 2015 by Blaine Bettinger in order to crowd-source the actual number of shared centiMorgans, cMs, of variously related people who match each other through autosomal DNA testing.

Obviously, in order to contribute to the Shared cM Project and participate, you must know how you are related to your matches. You can read about the earlier versions of the project, here.

The Shared cM Project has been very useful for genealogists attempting to determine potential relationships of unknown testers, in particular, because sometimes what we “expect” to see based on academic predictions and models isn’t actually what happens.

Of course, the flip side of that is that sometimes people who contribute relationships don’t understand or report relationships accurately; specifically relationships such as “half,” and “removed.” Nonetheless, with enough data, these reporting errors become statistical outliers. You can participate by contributing your known relationship data through the portal, here.

Blaine’s blog about the new V4 version is here and the full 56-page pdf paper about the results and methodology is here. If you want to understand how the project works, not only is this paper essential reading, it’s a wonderful educational source.

DNAPainter

By far, the most common usage of The Shared cM Project results is the interactive tool created at DNAPainter by Jonny Perl.

V4 DNAPainter

The Shared cM Project tools are found under the Tools and WATO tab, here.

V4 DNAPainter shared

Click on Shared cM Tool when navigating from the main DNAPainter page.

V4 DNAPainter complete chart

You’ll see the updated V4 relationship chart, with the field to enter the amount of shared cMs between you and a match above the chart, shown partially above.

V4 DNAPainter result

Selecting a cM number at random, I entered 1300. The results show the probabilities of various relationships between two people who match at 1300 cMs.

V4 DNAPainter table

1300 shared cMs can be any of the relationships shown, above. The grey, faded background relationships are not candidates at 1300 cMs, according to V4 of the Shared cM Project.

V4 DNAPainter histogram

A new feature added by Jonny provides the ability to click on a relationship and view the histogram from The Shared cM Project showing the submitted relationship amounts. For aunt/uncle at 1300 cMs, 26 people reported that matching amount. The most common amount of shared DNA was 1800 for that relationship category.

You can read Jonny’s latest blog introducing these new features, here.

Thanks to all of the 60,000+ contributors, Blaine and Jonny who made this possible.

DNAPainter Instructions and Resources

DNAPainter garden

DNAPainter is one of my favorite tools because DNAPainter, just as its name implies, facilitates users painting their matches’ segments on their various chromosomes. It’s genetic art and your ancestors provide the paint!

People use DNAPainter in different ways for various purposes. I utilize DNAPainter to paint matches with whom I’ve identified a common ancestor and therefore know the historical “identity” of the ancestors who contributed that segment.

Those colors in the graphic above are segments identified to different ancestors through DNA matching.

DNAPainter includes:

  • The ability to paint or map your chromosomes with your matching segments as well as your ethnicity segments
  • The ability to upload or create trees and mark individuals you’ve confirmed as your genetic ancestors
  • A number of tools including the Shared cM Tool to show ranges of relationships based on your match level and WATO (what are the odds) tool to statistically predict or estimate various positions in a family based on relationships to other known family members

A Repository

I’ve created this article as a quick-reference instructional repository for the articles I’ve written about DNAPainter. As I write more articles, I’ll add them here as well.

  • The Chromosome Sudoku article introduced DNAPainter and how to use the tool. This is a step-by-step guide for beginners.

DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts

  • Where do you find those matches to paint? At the vendors such as Family Tree DNA, MyHeritage, 23andMe and GedMatch, of course. The Mining Vendor Matches article explains how.

DNAPainter – Mining Vendor Matches to Paint Your Chromosomes

  • Touring the Chromosome Garden explains how to interpret the results of DNAPainter, and how automatic triangulation just “happens” as you paint. I also discuss ethnicity painting and how to handle questionable ancestors.

DNA Painter – Touring the Chromosome Garden

  • You can prove or disprove a half-sibling relationship using DNAPainter – for you and also for other people in your tree.

Proving or Disproving a Half Sibling Relationship Using DNAPainter

  • Not long after Dana Leeds introduced The Leeds Method of clustering matches into 4 groups representing your 4 grandparents, I adapted her method to DNAPainter.

DNAPainter: Painting the Leeds Method Matches

  • Ethnicity painting is a wonderful tool to help identify Native American or minority ancestry segments by utilizing your estimated ethnicity segments. Minority in this context means minority to you.

Native American and Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments

  • Creating a tree or uploading a GEDCOM file provides you with Ancestral Trees where you can indicate which people in your tree are genetically confirmed as your ancestors.

DNAPainter: Ancestral Trees

  • Of course, the key to DNA painting is to have as many matches and segments as possible identified to specific ancestors. In order to do that, you need to have your DNA working for you at as many vendors as possible that provide you with matching and a chromosome browser. Ancestry does not have a browser or provide specific paintable segment information, but the other major vendors do, and you can transfer Ancestry results elsewhere.

DNAPainter: Painting “Bucketed” Family Tree DNA Maternal and Paternal Family Finder Matches in One Fell Swoop

  • Family Tree DNA offers the wonderful feature of assigning your matches to either a maternal or paternal bucket if you connect 4th cousins or closer on your tree. Until now, there was no way to paint that information at DNAPainter en masse, only manually one at a time. DNAPainter’s new tool facilitates a mass painting of phased, parentally bucketed matches to the appropriate chromosome – meaning that triangulation groups are automatically formed!

Triangulation in Action at DNAPainter

  • DNAPainter provides the ability to triangulate “automatically” when you paint your segments as long as you know which side, maternal or paternal, the match originates from. Looking at the common ancestors of your matches on a specific segment tracks that segment back in time to its origins. Painting matches from all vendors who provide segment information facilitates one single repository for walking your DNA information back in time.

DNA Transfers

Some vendors don’t require you to test at their company and allow transfers into their systems from other vendors. Those vendors do charge a small fee to unlock their advanced features, but not as much as testing there.

Ancestry and 23andMe DO NOT allow transfers of DNA from other vendors INTO their systems, but they do allow you to download your raw DNA file to transfer TO other vendors.

Family Tree DNA, MyHeritage and GedMatch all 3 accept files uploaded FROM other vendors. Family Tree DNA and MyHeritage also allow you to download your raw data file to transfer TO other vendors.

These articles provide step-by-step instructions how to download your results from the various vendors and how to upload to that vendor, when possible.

Here are some suggestions about DNA testing and a transfer strategy:

Paint and have fun!!!

Proving or Disproving a Half Sibling Relationship Using DNAPainter

I had this nagging match at MyHeritage for some time who had not responded to messages and who didn’t have a tree. When she did reply, she explained that she was adopted, but I had already been working on how she was related.

Initially, I didn’t think too much of the match, especially when she didn’t reply, but after SmartMatching and Triangulation appeared on the scene, this match haunted me just about daily. Who the heck was Dee? We share enough DNA that we might even share a family resemblance.

Recently, when I became focused on my Dad’s life and (ahem) bad-boy mis-adventures once again, I realized that while this clearly isn’t a half-sibling match, my half-sibling would likely be long-deceased. I was born late in my father’s life and he was breaking hearts 40 years earlier – which means he could also have been fathering children. Dee could be my half-sibling’s child or grandchild.

Let’s take a look at this situation and how I used DNAPainter to quickly narrow the possibilities, even with no additional information.

The Problem

Here’s my match to Dee (not her name) at MyHeritage.

Dee matches me at 521 cM on 17 segments.

Taking a quick look at the DNAPainter Shared cM Tool, you can see that Dee falls into the non-dimmed relationship ranges below, with dark grey being the most probable.

The most likely relationships are shown in the table below.

Dee is in her 50s, so she’s clearly not my great aunt or uncle or grandparent.

The Possibilities

Based on who she matches, I know the match is from my father’s side. I have no full siblings and my mother’s DNA is at MyHeritage.

My father could have been begetting children beginning about 1917 or so and could have continued through his death in 1963.

My half sister’s daughter has also tested at MyHeritage, and Dee matches her more distantly than me, so Dee is not an unknown descendant of my half-sister.

Dee could have been a child or grandchild of a half sibling that I’m unaware of – which of course is my burning question.

I checked the in-common-with matches and while they made sense, I needed something much faster than working with multiple trees and matches and attempting to build them out.

Besides, I desperately wanted a quick answer.

DNAPainter to the Rescue

I’ve written three previous articles about utilizing DNAPainter.

I continue to paint matches where I can identify known ancestors. Currently, I’m up to 689 segments identified and painted which is about 62% of my genome.

Surely this investment should pay off now, if I can only figure out how.

I’ve painted hundreds of segments on both my paternal grandmother and grandfather’s sides. If Dee is a half sibling (descendant) to me, she will match both my paternal grandmother’s line and my paternal grandfather’s line. If Dee is related on one of those lines, but not the other, then Dee will match one grandparent’s line, but not the other grandparent’s line.

Dee can’t be descended from a half sibling if she doesn’t match both of my paternal grandparents, meaning William George Estes and Ollie Bolton’s lines.

Painting

The first thing I did was to paint the segments where Dee and I match, assigning a unique color.

After painting, I compared each chromosome individually, looking at the other ancestors painted that overlapped with the bright yellow.

The next step was to look at each chromosome and see which ancestor’s DNA overlaps with Dee’s.

Without fail, every single one of these segments matched with my paternal grandfather’s side, and none matched with my paternal grandmother’s side.
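The comparison described above, checking which ancestral lines overlap a match's segments, can be sketched roughly as follows. This is a simplified illustration and not how DNAPainter itself works internally. The segment positions and the `Segment` structure are invented, and the sketch assumes every segment has already been assigned to the paternal copy of the chromosome.

```python
from collections import namedtuple

# A painted segment: chromosome, start and end position, and the
# grandparent line it has been attributed to. All values are made up.
Segment = namedtuple("Segment", "chromosome start end line")


def overlaps(a, b):
    """True when two segments share any stretch of the same chromosome."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def lines_matched(match_segments, painted_segments):
    """Return the set of ancestral lines whose painted segments overlap the match."""
    return {
        painted.line
        for m in match_segments
        for painted in painted_segments
        if overlaps(m, painted)
    }


painted = [
    Segment(1, 10_000_000, 40_000_000, "paternal grandfather"),
    Segment(1, 60_000_000, 90_000_000, "paternal grandmother"),
    Segment(5, 5_000_000, 30_000_000, "paternal grandfather"),
]
match = [
    Segment(1, 15_000_000, 35_000_000, None),
    Segment(5, 10_000_000, 20_000_000, None),
]

found = lines_matched(match, painted)
both_lines = {"paternal grandfather", "paternal grandmother"} <= found
print(found)       # {'paternal grandfather'}
print(both_lines)  # False
```

In this made-up example, the match overlaps only segments from one grandparent's line. Under the reasoning in this article, that pattern is what argues against descent from a half sibling, since a half sibling's descendant would be expected to overlap both lines.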

To confirm, I have a cousin, we’ll call him Buzz, whose ancestor was my grandmother’s brother, so Buzz is my second cousin. If Dee is my half sibling’s child or grandchild, Buzz, who also tested at MyHeritage, would be Dee’s second cousin or second cousin once removed. No second cousins have ever been proven NOT to match, so it’s extremely unlikely that Dee is descended through Ollie Bolton.

Is there a very small possibility? Yes, if Dee is actually a second cousin twice removed from Buzz, which is genetically the equivalent of a third cousin. Third cousins only match about 90% of the time.

However, Dee also doesn’t match anyone else on my grandmother’s side, so it’s very unlikely that Dee descends from Ollie Bolton’s parents, Joseph “Dode” Bolton and Margaret Clarkson/Claxton.

Therefore, we’ve just “proven,” as best we can, that Dee does NOT descend from a previously unknown half-sibling.

We’ll just pause for a minute here – I was so hopeful☹

Regroup – Other Possible Relationships

OK, redraw the chart without Ollie. Dee is still very closely related, so what are the other possibilities?

Dee does match people with ancestors from both the lines of Lazarus Estes and Elizabeth Vannoy, so Dee is either an unknown descendant of William George Estes or his parents, given how closely she matches me and other descendants of this family.

Or… as luck would have it, Dee could also be descended from the sister of Lazarus Estes (Elizabeth Estes) who married the brother of Elizabeth Vannoy (William George Vannoy). Yes, siblings married siblings. Two children of Joel Vannoy and Phoebe Crumley married two children of John Y. Estes and Rutha (or Ruthy) Dodson.

You know, these mysteries can never be simple, can they?

In the chart above, gold represents the people who descend from a combination of a pink and blue couple. Joel Vannoy and Phoebe Crumley are shown twice because there was no easy way to display this couple.

One way or another Dee and I are related through these two couples. Of course, I’m curious as to how, and excited to help Dee learn about her family, but this isn’t going to be an easy solve, because of the potential double descent. Under normal circumstances, meaning NOT doubly related, Dee is most likely my half-great niece, meaning that her unknown grandparent is either a child of William George Estes (my grandfather) or descended from his parents, Lazarus Estes and Elizabeth Vannoy.

However, the doubling of DNA in the William George Vannoy/Elizabeth Estes line would make Dee look a generation closer if she descends from that line, so the genetic equivalent of descending from Lazarus Estes and Elizabeth Vannoy. The only way to solve for this equation would be to see how closely she matches a descendant of Elizabeth Estes and William George Vannoy – and no one from that line is known to have tested today.

For now, my driving question of whether I had discovered an unknown half-sibling has (most probably) been answered between the segment information at MyHeritage combined with the functionality of DNAPainter.

First Cousin Match Simulations

Update: Please note that in August of 2019, this article was updated to reflect 200,000 simulations as opposed to the original 80,000, along with other applicable statistics.

Have you ever wondered if your match with your first cousin is “normal,” or what the range of normal is for a first cousin match? How would we know? And if your result doesn’t fall into the expected range, does that mean it’s wrong? Does gender make a difference?

If you haven’t wondered some version of these questions yet, you will eventually, don’t worry! Yep, the things that keep genetic genealogists awake at night…

Philip Gammon, our statistician friend who wrote the Match-Maker-Breaker tool for parental match phasing has continued to perform research. In his latest endeavor, he has created a tool that simulates the matching between individuals of a given relationship. Philip is planning to submit a paper describing the tool and its underlying model for academic publication, but he has agreed to give us a sneak peek. Thanks Philip!

In this example, Philip simulated matching between first cousins.

The data presented here is the result of 200,000 simulations:

First cousin simulation V2

Philip was interested in this particular outcome in order to understand why his father shared 1206 cM with a first cousin, and if that was an outlier, since it is not near the average produced from the Shared cM Project (2017 revision) coordinated by Blaine Bettinger.

Academically calculated expectations suggest first cousins should share 850 cM. The data collected by Blaine from 1512 respondents showed an actual average of 874 cM, with a 99th percentile range of 553 to 1225 cM. You can view the expected values for relationships in the article, Concepts – Relationship Predictions, and in a second article, Shared cM Project 2017 Update Combined Chart, which includes a new chart incorporating the values from the 2016 Shared cM Project, the 2017 update and the DNA Detectives chart reflecting relationships as well.

Philip grouped the results into the same bins as used in the 2017 Shared cM Project:

First cousins shared cM format V2.png

The graph below is from the Shared cM Project tables.

Philip’s commentary regarding his simulations and The Shared cM Project’s results:

I’d say that they look very similar. The spread is just about right. The Shared cM data is a little higher but this is consistent with vendor results typically containing around 20 cM of short IBC segments. My sample size is more than 100 times greater so this gives more opportunity to observe extreme values. I observed 25 events exceeding 1410 cM, with a maximum of 1604 cM. At the lower end I have 787 events (about 0.4%) with fewer than 510 shared cM and a minimum of 272 cM.

I thought that the gender of the related parents of the 1st cousins would have quite an impact on the spread of the amounts shared between their children. Fewer crossovers for males means that the respective children of two brothers would be receiving on average, larger segments of DNA, so greater opportunity for either more sharing or for less. Conversely, the respective children of two sisters, with more crossovers and smaller segments, would be more tightly clustered around the average of 12.5% (855 cM in my model). There is a difference, but it’s not nearly as pronounced as I was expecting:

First cousins match curve V2.png

The most noticeable difference is in the tails. First cousins whose fathers were brothers are about two and a half times as likely to either share less than 8% or more than 17% than first cousins whose mothers were sisters. And of course, if the cousins were connected via a respective parent who were brother and sister to each other, the spread of shared cM is somewhere in between.

% DNA shared between the respective offspring of…

Offspring of           <8%     8-10%    10-15%   15-17%   >17%
2 sisters              0.6%    8.2%     82.0%    8.2%     1.0%
1 brother, 1 sister    0.8%    9.2%     79.5%    9.1%     1.5%
2 brothers             1.6%    11.1%    74.2%    10.7%    2.4%
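The "about two and a half times as likely" figure can be checked directly from the table above by adding the two tail columns for each group. This is only a worked check of the quoted percentages. The variable names are made up for the illustration.

```python
# Tail percentages copied from the table above: (<8%, >17%).
tails = {
    "2 sisters": (0.6, 1.0),
    "1 brother, 1 sister": (0.8, 1.5),
    "2 brothers": (1.6, 2.4),
}


def tail_total(pair):
    """Combined chance of landing in either tail, in percent."""
    low, high = pair
    return low + high


ratio = tail_total(tails["2 brothers"]) / tail_total(tails["2 sisters"])
print(round(ratio, 2))  # 2.5
```

In other words, about 4.0% of simulated cousin pairs descended from two brothers fall in the tails, compared with about 1.6% for two sisters.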

Shared cM Project 2017 Update Combined Chart

The original goal of Blaine Bettinger’s Shared cM Project was to document the actual shared ranges of centiMorgans found in various relationships between testers in genetic genealogy. Previously, all we had were academically calculated models which didn’t really reflect the data that genetic genealogists were seeing.

In June 2016, Blaine published the first version of the Shared cM Project information gathered collaboratively through crowd-sourcing. He continued to gather data, and has published a new 2017 version recently, along with an accompanying pdf download that explains the details. Today, more than 25,000 known relationships have been submitted by testers, along with their amount of shared DNA.

Blaine continues to accept submissions at this link, so please participate by submitting your data.

In the 2017 version, some of the numbers, especially the maximums in the more distant relationship categories changed rather dramatically. Some maximums actually doubled, meaning having more data to work with was a really good thing.

The 2017 project update refines the numbers with more accuracy, but also adds more uncertainty for people looking for nice, neat, tight relationship ranges. This project and resulting informational chart is a great tool, but you can’t now, and never will be able to, identify relationships with complete certainty without additional genealogical information to go along with the DNA results.

That’s the reason there is a column titled “Degree of Relationship.” Various different relationships between people can be expected to share about the same amount of DNA, so determining that relationship has to be done through a combination of DNA and other information.

When the 2016 version was released, I completed a chart that showed the expected percentage of shared DNA in various relationship categories and contrasted the expected cM of DNA against what Blaine had provided. I published the chart as part of an article titled, Concepts – Relationship Predictions. This article is still a great resource and very valid, but the chart is now out of date with the new 2017 information.

What a great reason to create a new chart to update the old one.

Thanks to Blaine and all the genetic genealogists who contributed to this important crowd-sourced citizen science project!

2016 Compared to 2017

The first thing I wanted to know was how the numbers changed from the 2016 version of the project to 2017. I combined the two years’ worth of data into one file and color coded the results. Please note that you can click on any image to enlarge.

The legend is as follows:

  • White rows = 2016 data
  • Peach rows = 2017 data for the same categories as 2016
  • Blue rows = new categories in 2017
  • Red cells = information that changed surprisingly, discussed below
  • Yellow cells = the most changed category since 2016

I was very pleased to see that Blaine was able to add data for several new relationship categories this year – meaning that there wasn’t enough information available in 2016. Those are easy to spot in the chart above, as they are blue.

Unexpected Minimum and Maximum Changes

As I looked at these results, I realized that some of the minimums increased. At first glance, this doesn’t make sense, because a minimum can get lower as the range expands, but a minimum can’t increase with the same data being used.

Had Blaine eliminated some of the data?

I thought I understood that the 2017 project simply added to the 2016 data, but if the same minimum data was included in both 2016 and 2017, why was the minimum larger in 2017? This occurred in 6 different categories.

By the same token, and applying the same logic, there are 5 categories where the maximum got smaller. That, logically, can’t happen either using the same data. The maximum could increase, but not decrease.

I know that Blaine worked with a statistician in 2016 and used a statistical algorithm to attempt to eliminate the outliers in order to, hopefully, eliminate errors in data entry, misunderstandings about the proper terms for relationships and relationships that were misunderstood either through genealogy or perhaps an unknown genetic link. Of course, issues like endogamy will affect these calculations too.

A couple good examples would be half siblings who thought they were full siblings, or half first cousins instead of just first cousins. The terminology “once removed” confuses people too.

You can read about the proper terminology for relationships between people in the article, Quick Tip – Calculating Cousin Relationships Easily.

In other words, Blaine had to take all of these qualifiers that relate to data quality into consideration.

Blaine’s Explanation

I asked Blaine about the unusual changes. He has given me permission to quote his response, below:

The maximum and minimum aren’t the largest and smallest numbers people have submitted, they’re the submissions statistically identified by the entire dataset as being either the 95th percentile maximum and minimum, or the 99th percentile maximum and minimum. As a result, the max or min can move in either direction. Think of it in terms of the histograms; if the peak of the histogram moves to the right or left due to a lot more data, then the shoulders (5 & 95% or the 1 and 99%) of the histogram will move as well, either to the right or left.

So, for example, substantially more data for 1C2R revealed that the previously minimum was too low, and has corrected it. There are still 1C2R submissions down there below the minimum of 43, and there are submissions above the maximum of 531, but the entire dataset for 1C2R has statistically identified those submissions as being outliers

The histogram for 1C2R supports that as well, showing that there are submissions above 531, but they are clearly outliers:

People submit “bad” numbers for relationships, either due to data entry errors, incorrect genealogies, unknown pedigree collapse, or other reasons. Unless I did this statistical analysis, the project would be useless because every relationship would have an exorbitant range. The 95th and 99th percentiles help keep the ranges in check by identifying the reasonable upper and lower boundaries.
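A small sketch may make this easier to picture. It is not Blaine's actual method, which is described in his pdf. It uses a simple nearest-rank percentile as a stand-in, and the submissions are invented purely for illustration. The point is only that percentile-based bounds can move inward when more data arrives, even though the extreme submissions are still in the dataset.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def bounds(values, low_pct=1, high_pct=99):
    """Reported (minimum, maximum) as percentile cut-offs, not raw extremes."""
    return percentile(values, low_pct), percentile(values, high_pct)


# Hypothetical submissions in cM, invented for illustration only.
early = [10, 40, 120, 200, 220, 240, 260, 300, 480, 900]
later = early + list(range(150, 350, 2)) * 3  # many more mid-range submissions

print(bounds(early))  # (10, 900)
print(bounds(later))  # (150, 348)
```

With only ten submissions, the bounds are the raw extremes. After 300 more mid-range submissions, the reported minimum rises and the reported maximum falls, while the 10 and 900 values remain in the data as outliers.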

Adding Additional Information

The reason I created this chart was not initially to share, but because I use the information all the time and wanted it in one easily accessible location.

I appreciate the work that Blaine has done to eliminate outliers, but in some cases, those outliers, although in the statistical 1%, will be accurate. In other cases, they clearly won’t, or they will be accurate but not relevant due to endogamy and pedigree collapse. How do you know? You don’t.

In the pdf that Blaine provides, he does us the additional service of breaking the results down by testing vendors: 23andMe, Ancestry and Family Tree DNA, and comparison service, GedMatch. He also provides endogamous and non-endogamous results, when known.

The vendor where an individual tests does have an impact on the testing, the matching and the reporting. For example, Family Tree DNA includes all matches to the 1 cM level in total cM; Ancestry strips out DNA they think is “too matchy” with their Timber algorithm, so their total cM will be much smaller than Family Tree DNA’s; and 23andMe is the only one of the vendors to report fully identical regions by adding that number into the total shared cM a second time. This isn’t a matter of right or wrong, but a matter of different approaches.

Blaine’s vendor specific charts go a long way in accounting for those differences in the Parent/Child and Sibling charts shown below.

A Combined Chart

In order to give myself the best chance of correctly locating not just the best fit for a relationship as predicted by total matching cM, but all possible fits, I decided to add a third data source into the chart.

The DNA Detectives Facebook Group that specializes in adoption searches has compiled their own chart based on their experiences in reconstructing families through testing. This chart is often referred to simply as “the green chart” and therefore, I have added that information as well, rows colored green (of course), and combined it into the chart.

I modified the headings for this combined chart, slightly, and added a column for actual shared percent since the DNA Detectives chart provides that information.

I have also changed the coloring on the blue rows, which were new in 2017, to be the same as the rest of Blaine’s 2017 peach colored rows.

I hope you find this combined chart as useful as I do. Feel free to share, but please include the link to this article and credit appropriately, for my work compiling the chart as well as Blaine’s work on the 2016 and 2017 cM Projects and DNA Detectives’ work producing their “green chart.”
