Demystifying Ancestry’s Relationship Predictions Inspires New Relationship Estimator Tool

Today, I’m extremely pleased to bring you a wonderful guest article written by Karin Corbeil as spokesperson for a very fine group of researchers at www.dnaadoption.com.

I love it when citizen science really works, pushes the envelope, makes discoveries and then the scientists develop new tools!  This is a win-win for everyone in the genetic genealogy community – not just adoptees!  I want to say a very big thank you to this wonderful team for their fine work.

Take it away, Karin…

As genetic genealogists, we are always looking for a better “mousetrap”: tools and analyses that help us better understand what we are actually looking at in our DNA results.  For adoptees and those with unknown ancestors, this can be even more important.

When Ancestry came out with their “New Amount of Shared DNA”, an explanation was necessary to understand what we were seeing.

We at DNAAdoption are asked over and over again to explain why a half-sibling was predicted as a 1st cousin, why a predicted “Close Family – 1st cousin” could actually be a half-nephew, or why a predicted 3rd cousin could be a 4th cousin.  Ancestry doesn’t provide the detailed information needed to support its predicted relationship categories, so providing these explanations was often a struggle.

We knew that Ancestry’s total amount of shared DNA and number of segments cannot be compared directly with the figures from the typical tools utilized by genetic genealogists, so relationship inferences based on those tools do not carry over. Ancestry’s totals will be lower and its segments will be broken into more pieces because the Timber algorithm removes segments it identifies as invalid matches.[1]

So, to get a better picture of how Ancestry sets its predictions, we at DNAAdoption gathered data on 1,122 matches from different testers who had confirmed the specific relationship for each match. The collaborative effort was led by Richard Weiss of the DNAAdoption team.  Richard worked his magic with the data, and the results are presented here.

A clip of the Pivot table from the data input:

Ancestry relationship table

The full data spreadsheet can be downloaded here:

Ancestry Predictions vs. Actual Relationships

Ancestry Predictions vs actual relationships

The most interesting thing about comparing the predicted and actual relationships was seeing how greatly the more distant relationships can vary. Look at the 4th cousin prediction, for example. The actual relationships range from a half 1st cousin once removed to an 8th cousin once removed. (This confirmed 8th cousin once removed probably shares an intact segment that, due to the randomness of DNA inheritance, persisted for many generations.) This makes it extremely difficult to assess any predicted relationship at the 4th cousin level. Even 1st, 2nd and 3rd cousin predictions had wide variances.

The only conclusion we can draw from this is to use Ancestry predictions with extreme caution.

With this data, we were able to add the numbers to the DNA Prediction Chart that we use in our DNA classes at DNAAdoption.

DNA Prediction Chart

DNA Prediction Chart 2

The full Excel spreadsheet can be downloaded here.

We then incorporated this data into our Relationship Estimator Tool created by Jon Masterson.

Jon explains, “This small program is intended to make the DNA Prediction Chart Spreadsheet a bit easier to use. It is based entirely on the data in this spreadsheet plus some interpolation of missing values. The algorithm to determine the most likely relationship(s) is very simple and based on summing the score of valid entries in the table for a given input. It is very much an experiment and test. It is likely to be less accurate with close relationships where there is missing data in the spreadsheet. You can also save the match information that you generate.”
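
To make the idea of “summing the score of valid entries” concrete, here is a minimal sketch of how such a lookup could work. It is not Jon’s code: the scoring rule (one point per chart column whose range contains the input) is our own guess at a simple interpretation, and the chart rows, relationship labels and ranges below are invented placeholders, not values from the DNA Prediction Chart.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]

# PLACEHOLDER rows, invented for illustration only. These are NOT values
# from the DNA Prediction Chart; substitute the real spreadsheet rows.
HYPOTHETICAL_CHART: Dict[str, Dict[str, Range]] = {
    "relationship A": {"total_cm": (200.0, 400.0), "segments": (8, 20)},
    "relationship B": {"total_cm": (80.0, 250.0), "segments": (3, 10)},
    "relationship C": {"total_cm": (20.0, 100.0), "segments": (1, 5)},
}


def score_relationships(
    total_cm: float,
    segments: int,
    chart: Dict[str, Dict[str, Range]] = HYPOTHETICAL_CHART,
) -> List[Tuple[str, int]]:
    """Rank relationships by how many chart columns contain the input.

    Assumed rule: each column (total cM, segment count) whose range
    contains the observed value adds one point. Relationships that
    score zero are left out of the result.
    """
    observed = {"total_cm": total_cm, "segments": segments}
    scores = []
    for relationship, ranges in chart.items():
        score = sum(
            1
            for metric, (low, high) in ranges.items()
            if low <= observed[metric] <= high
        )
        if score > 0:
            scores.append((relationship, score))
    # Highest score first; ties keep the chart's row order.
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for relationship, score in score_relationships(300, 9):
        print(relationship, score)
```

With the placeholder rows, an input of 300 cM in 9 segments ranks “relationship A” first (both columns fit) and “relationship B” second (only the segment count fits), which mirrors how the tool can list several possible relationships for a single match.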

First, download the zip file RelationshipEstimator.zip here.

Extract the files from the zip file and run RelationshipEstimator.exe.

relationship estimator

The following results are for the same person, who has been confirmed as a 3rd cousin. The first set of data is from Gedmatch; the second is from Ancestry. At Gedmatch, this match’s segments over 5 cM total 122.9 cM across 5 segments; at Ancestry, the same person shows Shared DNA of 112 cM across 7 segments.

For 23andMe/FTDNA/Gedmatch, enter the individual segment lengths in the first box, using a slash “/” between each number.

At the “Source” box select 23andMe/FTDNA/Gedmatch, then click the “Process” button. Several possible estimated relationships will show.
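
If you keep your segment lists in a text file or spreadsheet, the slash-separated format is easy to check before pasting it in. The sketch below shows one way to total such a string; the function name `parse_segments` and the 5 cM cutoff are our assumptions for illustration and do not describe how RelationshipEstimator.exe handles its input. The sample segment lengths are made up.

```python
from typing import Tuple


def parse_segments(text: str, min_cm: float = 5.0) -> Tuple[float, int]:
    """Total a slash-separated list of segment lengths in cM.

    Assumption: segments shorter than min_cm are ignored, mirroring the
    "over 5 cM" total quoted for the Gedmatch example. Returns the total
    (rounded to one decimal place) and the number of segments kept.
    """
    # Skip empty pieces so a trailing slash does not cause an error.
    lengths = [float(part) for part in text.split("/") if part.strip()]
    kept = [length for length in lengths if length >= min_cm]
    return round(sum(kept), 1), len(kept)


if __name__ == "__main__":
    # Made-up segment lengths, for illustration only.
    total, count = parse_segments("50/30.5/12/4.2")
    print(f"{total} cM across {count} segments")
```

Here the 4.2 cM segment falls under the cutoff, so only three segments are counted toward the total.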

Relationship estimator 2

For Ancestry, enter the total cM and the number of segments.  At the “Source” box select “Ancestry”, then click “Process”.

Relationship estimator 3

More information about this tool can be found here.

Given the larger variance with the Ancestry data (6 estimated relationships vs. 3 for the actual Gedmatch data), we can only encourage those on Ancestry to upload their raw data file to Gedmatch. Of course, we still hope that one day Ancestry will release the full segment data in a chromosome browser.

We at DNAAdoption continue to try and provide analyses and tools, many times in cooperation with DNAGedcom, to give those searching for their roots better information. But we are “not for adoptees only” and provide this information for the genetic genealogy community as a whole.  We plan to add more data to these analyses in the near future.  We hope you will find it useful.

Your questions and comments are welcome.

Karin Corbeil (karincorbeil@gmail.com)

Diane Harman-Hoog (harmanhoog@gmail.com)

Richard Weiss (rnlweiss@gmail.com)

Jon Masterson (jon@scruffyduck.co.uk) 

[1] Roberta Estes, paraphrased from  http://dna-explained.com/2015/11/06/ancestrys-new-amount-of-shared-dna-what-does-it-really-mean/


35 thoughts on “Demystifying Ancestry’s Relationship Predictions Inspires New Relationship Estimator Tool”

  1. AncestryDNA’s Timber knocked a known second cousin of mine down to fourth cousin status with only 39cM/5 segments! Known fourth cousins and I share about 65cM/5. Other second cousins and I share well over 170cM/10. Timber is screwy!

    That being said, on another family line Timber does appear to be in agreement with GEDMatch and FTDNA as far as the “guesstimates” go. Known papertrail fourth cousins are fourth cousins in AncestryDNA, FTDNA and GEDMatch.

    Timber and GEDMatch are in agreement that two papertrail sixth cousins come in as fourth cousins. Those sixth cousins and I may share an as yet unknown other family line somewhere.
    I have a higher level match who is an adoptee and we could be closer than what Timber says.

  2. Lately, at AncestryDNA, I have noticed many tree matches (not segment matches) with people who share only ~ 5cMs with me. ?

  3. Although the charts say that our chance of matching a particular 7th+ cousin is 2% or less, we have to remember that the farther back we go, the more descendants these ancestors have. The probability of matching some 8th or 9th cousin is still quite good; it’s only the chance of a particular 8th cousin showing up among our matches that is very small.

  4. Roberta, last night I had a new dna match at Ancestry, 12 cMs, 1 segment. At Gedmatch, he shows with 0 cMs; and at FTDNA he is not reported as a match. We do have 2 tree matches, both 1650. Is this a result of their Timber?

    • Timber works the other way – it reduces the amount shown on Ancestry compared to the other sites rather than show more.

      I also have a match on Ancestry (the lowest of my 4th Cousin matches at 17.9 cMs) who shows as no match at all on GEDMatch.

      • I did try that. GEDMatch reported no match at all at any threshold.

        The other person contacted me and I convinced him to upload to GEDMatch. After GEDMatch showed no match I did not hear from the person again and I did not pursue it further.

  5. Thanks for the chart, just when I needed it! ^_^

    So 2nd or 3rd cousin… How bad can endogamy screw the results? Can four 8x great-grandparents give such a strong signal or should I keep looking for other links (and possibly Non-Paternal Events)?

    • Yes, it can change the estimate radically. The first close match I had on FTDNA was estimated at 2-3rd cousin. After we researched our trees, it turned out to be an 8th cousin with 4 sets of common ancestors (3 of the men were brothers), and some of their descendants married, so DNA coming through is much stronger.

  6. Thank you for an interesting article. I really appreciate all that DNAadoption does in providing education and tools.

    Has anyone done any studies on how the accuracy of the relationship prediction increases as more cousins are used to determine the average total cM? For example, if only one match is used compared to 4 matches?

  7. Roberta, I love all of your articles and have learned more than I ever thought possible. Thank you!! May I go off subject, please?
    I have just learned that FTDNA and MyHeritage are teaming up with a “Super Search” program. In your opinion, does this have merit?

    • I don’t know, truthfully, aside from the significant discount – and that part is nice. I should probably join to check it out. I don’t know what kinds of records they have that Ancestry does not, nor do I know if or how the DNA is involved.

      • Thank you. Between Ancestry, LDS Family History and Heritage Quest through the library, there can’t be much left to discover. Now if FTDNA can add something relating to DNA, I will be interested.

    • MyHeritage states that they have more records from areas outside of the USA and the British Isles than other sites. Another nice option is that MyHeritage also includes newspaperarchives.com, which is a nice source of obituaries, marriage and birth announcements. (I can’t remember if the newspaper archives are part of the basic membership or not, so you might want to check this first.)

  8. OK, I give up. I don’t see anything on my Ancestry DNA pages about “New Amount of Shared DNA”. I finally have a new, real, honest-to-goodness 2nd cousin who shows up as a 3rd – 4th cousin on Ancestry. Since she doesn’t seem interested in uploading to GEDMatch, I would be curious to see how much DNA we share.

  9. Hello,

    Thank you for this helpful article and the link to the Combined Relationship Chart. The DNA ranges shown in Total cM are really helpful. I am currently working with an adopted individual in search of her family. She was adopted in the State of New York. We have successfully identified a set of distant grandparents for her via DNA. Our next step is attempting to better understand the relationships with her close DNA matches. Her closest cousin match could be a 1st cousin 1X, 1st cousin 2X, or a 2nd Cousin based on the combined relationship chart.

    Do you find that relationships typically stay within the cM ranges shown? For example, is it plausible that a 3rd cousin relationship could share more than 150 cM shown on the chart?

    I am also particularly interested in how the shared DNA correlates to Half Siblings. In the case of the adopted individual I am assisting, we know that her two close cousin matches are linked to a common Grandparent who was married twice and had children from each marriage.
    If you are interested in learning more about our project visit http://discovering-sally-ann.blogspot.com/

    Thank you,

    Michelle

  10. Pingback: Concepts – Identical by…Descent, State, Population and Chance | DNAeXplained – Genetic Genealogy

  11. With empathy for adoptees and their journeys to discover their biology, there are serious questions on the use of DNA and the violation of biological DNA donors’ rights to privacy. These rights have been upheld in four Supreme Court decisions and there can be severe consequences. Courts in most states are sympathetic and will help trail blaze a path toward information, but courts require an intermediary, which is a healthy professional approach for all parties! Unless the goal of the adoptee is to force confrontation and acknowledgement, one might consider moving forward with wisdom.

    • I “forced a confrontation” with my supposed bio family that I found through Ancestry and they were more than thrilled that I found them. They call or text every day and are hoping and praying that the DNA comes back that I am their daughter/niece.

  12. Roberta, I am a bit confused. Seems the more I look at things, the more confused I get. I understand Ancestry and their Timber, but am stuck with their relationship categories. Mine says that my niece and I share 1347 cM and she is a close relative to first cousin, but that range starts at 1450. At Gedmatch it is 1393 cM in 78 segments, so I think it is niece, but it does not fit in the tables. Can you please provide some guidance?

    Thank you, dm

    • Ancestry strips out segments. You don’t know how much, and there is no way to tell. My suggestion is to utilize the GedMatch or Family Tree DNA results instead. I have a new combined table coming very shortly. She fits at the low end of the spectrum.

  13. Greetings! We were contacted by someone who was adopted and had a close family-1st cousin match to my husband on Ancestry, and we both loaded the raw DNA into Gedmatch, as well as my son’s. We have been treating my husband’s/son’s match as a half brother/half uncle due to the results. Am I correct in doing so?
    Match to my husband is:
    >7cm
    1513.5 total shared cm
    174.1 largest segment
    33 matching segments
    Match to my son is:
    >7cm
    913.8 total shared cm
    166.6 largest segment
    22 matching segments

    My husband is 43, my son is 19 and he (the match) is 47. My husband’s father has been told about this match but at this point in time is not interested in doing a DNA kit or, quite frankly, not interested at all. We had my husband’s full brother do a DNA kit, which was just received by Ancestry, so we are awaiting the results, which will be a while.
    The adoptee had no information about biological family and was adopted in California. We have been building a relationship with him and his family over the last few weeks and welcomed them all with open arms. I would just like reassurance I guess or am I wrong? I really want to help him in any way I can.

    Thank you so much for all you do!

  14. Roberta, is there any prediction chart or tool for x DNA? As it happens both of my mom’s closest deadends are on the x path so I am highly motivated to understand x better. All I really know is x segments persist longer so the regular autosomal predictions won’t work.

  15. Yes, segments smaller than about 30 cM can persist for many generations. In fact, I’ve seen segments as large as 50 cM come down from the 17th century.

    What is the danger in that? People need to KNOW it, but there’s no particular DANGER!

    That’s actually very convenient if you need to solve a 17th or 18th century genealogical problem!

    There are special problems with researching at that genetic distance, and adoptees obviously don’t want to begin by bothering with it if they can possibly avoid it; when I help people identify their parents, I obviously focus on the recent relatives and places that usually jump right out at one!
