Y DNA Match Changes at Family Tree DNA Affect Genetic Distance

Recently, group administrators received information that Y matching has changed at Family Tree DNA.

GD1

This is a welcome update.

The changes introduce less restrictive matching algorithms, based on knowledge gained about how mutations on the Y chromosome occur.

These new matching algorithms also affect the calculation of genetic distance. I wrote about genetic distance here, and this new information supplements the original article.

All changes result in less restrictive matching. Therefore, if you notice any changes at all, you should have additional Y DNA matches, not fewer, whether as a result of your own marker values or those of someone you now match, but didn’t before.

Normal Matching

Normally, if person 1 has a value of 12 and person 2 has a value of 14, on any marker, the genetic distance is counted as 2, the difference between the two values.

GD2
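For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of normal matching on a single-value marker. The function name is made up for illustration and is not anything Family Tree DNA publishes.

```python
def stepwise_distance(value_1: int, value_2: int) -> int:
    """Normal matching: the distance on one marker is the numeric
    difference between the two people's values."""
    return abs(value_1 - value_2)


# The example from the text: person 1 has 12, person 2 has 14.
print(stepwise_distance(12, 14))
```

The order of the two people does not matter, since only the size of the difference is counted.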

The new changes vary from the normal matching, depending on the marker and the values.

Null Value Markers

When a marker has a null value, meaning a value of 0, that marker will be counted as one difference when compared to other markers with numeric values.

GD3

The new genetic distance calculation of 1, when one individual has a marker value of zero, has been implemented to reflect that the mutation resulting in the deletion of one individual’s DNA at that location likely happened in one step, not in several.

Null values are most often seen on marker 425, but can appear elsewhere as well. All null marker values are treated in this same manner.
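The null rule can be sketched the same way. This is only an illustration under stated assumptions: the function name is hypothetical, a null is represented as 0 as described above, and treating two nulls on the same marker as an exact match is an assumption that the text above does not spell out.

```python
def distance_with_nulls(value_1: int, value_2: int) -> int:
    """Distance on one single-value marker, with the null rule applied."""
    if value_1 == value_2:
        # Identical values match exactly (assumed to include two nulls).
        return 0
    if value_1 == 0 or value_2 == 0:
        # One null versus a numeric value counts as a single step,
        # no matter how large the numeric value is.
        return 1
    # Otherwise fall back to normal stepwise matching.
    return abs(value_1 - value_2)


# Hypothetical values: a null compared with a value of 12.
print(distance_with_nulls(0, 12))
```

Under plain stepwise arithmetic the same comparison would have produced a distance of 12, which is why people with null values are the most likely to see new matches.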

Dual Value Markers

Most markers with hyphenated values are being treated less restrictively. Family Tree DNA has provided the list of markers affected by this change, below.

GD4

Matching now looks at the total difference of the two values combined, not the difference at each hyphenated value individually. In other words, the order of the values no longer matters.

GD5

There are two changes in the above calculation when any two values are the same.

  • Change 1 – The common values cancel each other, regardless of where they appear in the marker.
  • Change 2 – The genetic distance is now 1 if there is a difference in the remaining values, instead of the previous 3, in this example. In other words, the value of 1 reflects that there is a genetic distance and does not assume that the mutation occurred in 3 discrete steps.

However, in the instance where any two values are NOT the same, a different matching routine is involved.

GD6

In this case, the genetic distance is 2 because there are no common values to cancel and the mutations are much more likely to have occurred discretely.
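One way to read the two dual value cases is sketched below in Python. The text only gives the two outcomes (a distance of 1 when one value is shared, and 2 when nothing is shared), so counting one step per value left over after cancelling is an assumption that happens to fit both. The function name and the marker values are hypothetical, and the sketch does not try to handle people with different numbers of copies.

```python
from collections import Counter


def dual_value_distance(values_1, values_2):
    """Distance on one hyphenated marker, ignoring the order of the values."""
    if len(values_1) != len(values_2):
        raise ValueError("this sketch only covers equal numbers of values")
    # Counter subtraction cancels values the two people have in common,
    # wherever they appear in the marker.
    leftover = Counter(values_1) - Counter(values_2)
    # Assumption: each value that could not be cancelled counts as one step.
    return sum(leftover.values())


print(dual_value_distance([11, 14], [14, 14]))  # one value shared
print(dual_value_distance([11, 14], [12, 15]))  # nothing shared
print(dual_value_distance([14, 11], [11, 14]))  # same values, different order
```

In the first comparison the shared 14 cancels and the remaining 11 versus 14 counts as a single step, where position-by-position subtraction would have given 3.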

Marker 464

Marker 464 typically has 4 values, 464a, 464b, 464c and 464d. However, this marker can also be found with one or more additional values, such as 464e, 464f, etc.

GD9

When the common marker values are the same, as shown above, the fact that one person has additional markers, regardless of how many, is counted as one difference, because the mutation that created these additional markers likely happened at one time.

GD8

In the event where the common marker values are not the same, as shown above, common values are cancelled, with the nonmatching values being counted as one genetic step, the same as in the dual value marker example above. In this case, one genetic step is assigned for the 4 extra markers, and one additional step for the difference between markers 464b and 464c, for a total genetic distance of 2.
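The two 464 cases can be put together in a short Python sketch. It is a rough illustration, not Family Tree DNA's code: the function name and the values are hypothetical, and it assumes that the copies both people have are the first ones listed (464a through 464d when one person has four).

```python
from collections import Counter


def distance_464(copies_1, copies_2):
    """Distance on marker 464 when one person may have extra copies."""
    shared = min(len(copies_1), len(copies_2))
    # Any number of extra copies counts as a single step.
    extra_step = 1 if len(copies_1) != len(copies_2) else 0
    # Cancel common values among the copies both people have.
    leftover = Counter(copies_1[:shared]) - Counter(copies_2[:shared])
    # Any difference that remains after cancelling counts as one more step.
    value_step = 1 if leftover else 0
    return extra_step + value_step


four_copies = [15, 15, 17, 17]
eight_copies = [15, 15, 17, 17, 17, 17, 18, 18]
print(distance_464(four_copies, eight_copies))       # extra copies only
print(distance_464([15, 16, 17, 17], eight_copies))  # extra copies plus a value difference
```

Read this way, the largest distance marker 464 can contribute is 2: one step for the extra copies and one for any difference among the common values.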

Thanks to Family Tree DNA for providing this additional information.

27 thoughts on “Y DNA Match Changes at Family Tree DNA Affect Genetic Distance”

  3. Roberta,
    I am not sure that I clearly understand where the GD of 2 comes from on the last example for marker 464. I understand where 1 GD comes from the additional markers for 464e through 464h. Your explanation for the additional marker confused me a little. You said it was caused by the difference between 464b and 464c. Don’t the similar values in 464c cancel out and the differences in the values for 464b between the two people cause the additional GD? Am I looking at this incorrectly?

    Thanks,
    BillR

    • You are correct that 1 comes from the 4 extra markers. The second 1 comes from the fact that after cancelling, there is a difference in the values of at least two of the markers. The fact that there is a difference is counted as 1. So the maximum difference you can have for 464 is 2, 1 for extra markers and one for a math difference in any of the matching markers.

  4. Roberta: Have the changes taken place already? I have downloads of my Y12, Y25, Y37, Y67 and Y111 matches from June 12. When I compare to today’s downloads, none of the genetic distances of anyone changed. Or are they still coming?

  5. If I read this correctly, the change has not yet occurred, but its implementation in the near future has been announced to group administrators. Is there a reason FTDNA has not informed all Y-DNA customers?

  6. Thanks so much, you have a good way of explaining things. I do wonder if for the Y-37 test, there are a total of five markers with changed algorithms, i.e., DYS 19, DYS 385, DYS 459, CDY, and YCAII?

    • Yes, but these are not all in the 37 marker panel. They are scattered between the panels. Most people won’t see much if any change at all. For the most part, either you are already matching on these markers, or you are already mismatching on more than the threshold, so these markers aren’t the ones to make the difference. The most dramatic changes will be for someone with a null value.

  7. I am one of those with multiple nulls. Which way will this affect me? We may be a small number, but those nulls can indicate significant markers for our families and connection with others with the same or nearly the same results.
    I saw a chart recently, which I cannot now locate, that has the modes for each allele and the percentage of variation from the mode. Since the nulls once had a value before mutating, assigning them a GD of 1 can be a disservice. Some of those nulls could very well have been the high percentage mode and match exactly. I understand that it is impossible to know what the missing value was, but I am concerned how this method will affect us. Positively or negatively?
    Three members of my family have 14 nulls in our Y-67 results. I am an administrator of a group with similar results. I would appreciate learning more so I can explain the situation better. The uniqueness of our results has good and bad implications. It appears to me that it is a marker that can tie families together because of the uniqueness. Most of those I have found have different surnames, but I am discovering that those names are showing up in marriages and in close proximity to one another.
    So “The people most likely to see a difference are those with null values and they are a small percentage.” What differences are you referring to?
    Is there a source or resource I can tap into to learn more about the effect of nulls on our GD and other ramifications in our genetic research?

    • Wow, in all the years I’ve done the Y DNA reports, I’ve never seen anyone with that many null values. You should still match your people with null values exactly. Others will have a GD of 1 for each null, so it won’t take many to disqualify them from being listed as a match. Don’t have any more information other than what was provided. It will be interesting to see how the new technique affects someone with so many null values. The more common scenario is when a null has developed mid-lineage so men who should match aren’t matching. If you are in a null project for your various markers, you will still be able to see within the project who matches whom on which markers. So, how has this change affected your matches?

  8. Thank you for your response! I would like to get in touch with a university grad student looking for a Doctoral thesis project in genetics! This would be an unusual opportunity to focus on this unique genetic result! My oldest son is a graduate of Cal Tech in Electrical Engineering and worked for Apple for 26 years, my other son works for Intel in computer forensics, I have a masters in Education and was a school teacher for 30 years, now retired. We aren’t worms! So I guess the missing DNA was not life threatening. My dad and grandfather lived to age 93! Maybe that is the benefit of those particular missing alleles!

    I haven’t seen a decrease in matches at the 25, 37, 67 levels. Because of the way nulls are being treated, 14 of them rules out many who could be perfect matches. So, what the effect of the new procedures will be is unknowable. I’m sure I should have many more hits with a GD of 0 in the 25, 37 and 67 range. I have not tested to 111. The nulls do not appear until the 37 test. So many people opt for the DNA-12 test which is just a waste of money as far as I’m concerned!
    If only there was a way for me to plug in the mode for the nulls to find matches, or rather potential matches, to research and perhaps make connections. Joining a project as things stand now, my original test results are automatically uploaded to the project. There must be a way. My group may be small, but we should not be ignored!!!!!! My paper trail goes back many generations and very possibly to Rollo and the early royals of England and France. Finding the source of the mutation could have ramifications for future genealogists who run into our unique situation.

  9. I haven’t seen any changes yet. I did an experiment using Y-Utility: Y-DNA Comparison Utility, FTDNA 111. I created a spreadsheet of my null group including myself. I then created another copy of my null results, but replaced the nulls with the mode values for the respective tested alleles. I copied the spreadsheet into the Y-Utility: Y-DNA Comparison Utility, FTDNA 111 and clicked on the expedite button. The results for the null set and the “mode” set made no difference in the results. That really surprised me. I was expecting to see more matches!

    I would like to have someone looking for a doctoral thesis in genetics to take on the effect of multiple nulls. I believe the research could have significant implications so far unknown because of the uniqueness.

    At this point the idea seems to be that this group is so small that it is insignificant and not worthy of consideration. I believe that is a big mistake and I hope someone with a lot more knowledge of genetics and genealogy would consider what could be learned and applied to how we compare results to the majority of tests that come back with no nulls or the normal DYS425 and a few others.

    My Prouse/Prowse Project on FTDNA was started because there wasn’t a Prouse/Prowse Project. When I received my results and then the same results for my sons it morphed into a Multiple Null Project. We welcome any and all who have multiple nulls to join the project.

    Most of those who have joined do not carry the Prowse, Prouse, Preaux, Pragtellis, Pratt, Prouz, Prowze, Pross, etc. surname. So, how do these other names fit into the family? I have found many of the names in close proximity to each other. One of the closest matches is to a Chappell/Chapple from 17th century Devonshire, England. Other names of interest that have multiple nulls are: Tuscan, Gaskill & Gascoigne with 0 GD. Other names with 1, 2, 3, or 4 GD are Reed, Hicks, Richards, Nisson, Hatcher. Allied families who are associated with these surnames include Winters, Lippincott, Knight, Wells, and Wentworth.

    I have multiple connections through my direct Prouse line back to the royals of England and France as well as such notables as William Marshal, “The Greatest Knight”, Rollo, William the Conqueror, and many others. The genetic testing of Rollo’s grandson which is currently being studied is of interest.

  10. Thank you very much for this clarifying article! But I still have a question. When person 1 has DYS448=19 and person 2 has DYS448=19.2, is their genetic distance then zero, 1 or 0.2 ?

    • The micro-alleles represented by the .x numbers are rounded to the nearest number for calculations. In your example, 19.2 would be rounded to 19 and compared to others as such. If all the men had 19.2, they would all be counted as equal, as they would with any man who was 19 with no micro-alleles.

      • Thank you for this useful reply. I could be wrong but I rather have the feeling that the mathematical rounding off (19.4 to 19 but 19.6 to 20) for the calculation of the genetic distance is more based on a convention than on a law of nature.

      • No one else reports or ever has reported them at all. I have found that while they are interesting, they have never made a difference in any report that I’ve worked on. But knowing they exist gives you the opportunity to ask the people you match whether they have them or not.

  11. Hello Roberta,
    My name is Clovis La Fleur. I’m an administrator of the Stark Family Y-DNA Project. I’ve been doing TMRCA Tip calculations related to my project for quite a few years. I’ve recently been doing some updates after the Genetic Distance changes by FTDNA occurred. I noticed something different in the Tip probability calculations relative to the CDY markers in the 37 marker panel.

    When I compare two persons who are a perfect 37 Marker match to each other, Gen 1 begins with 36.26%. Previously, when I compared person 1 at the CDY markers with allele values 36-38 to person 2 with allele values 35-38, Gen 1 began with 12.01%, subsequent values over 24 generations lower in value than those of a perfect match. FTDNA reported a genetic distance of 1 for the two persons compared.

    Yesterday, I repeated the comparison of Person 1 to Person 2, reported by FTDNA to have GD=1, and found Gen 1 was now 36.26% and all of the percentage values for Gens 1 through 24 were identical to those of a perfect match between any two persons compared.

    I understand the GD changes you cited above. However, what have I missed relative to the CDY marker? It appears a mismatch of 1 at either CDY marker, while all other markers are a perfect match, now results in perfect match TMRCA percentage calculations over the 37 marker panel. This occurred with FTDNA declaring Person 1 and Person 2 had a genetic distance of 1.

    Appreciate your comments above!!

    Clovis La Fleur

      • It appears they did change the TIP calculations. My suspected 3rd cousin was previously shown as having roughly a 28% probability of an MRCA within 4 generations, but he now shows roughly 57%. That’s quite a change.

  12. Hi Roberta

    I’ve been trying to find out why one of my matches has disappeared. It was previously a 25 GD 2 match. Customer Support have suggested it might now be a GD 3.

    After reading your article about the 2016 changes, I wonder if it should still be GD 2 match?

    Man 1 (M1) & Man 2 (M2) match exactly on 30 out of the 37 markers (including all but one of the fast-movers).

    Here are the markers with differences:

    Panel 1:
    DYS390 M1 – 25 M2 – 26
    DYS391 M1 – 11 M2 – 10

    Panel 2:
    DYS459 M1 – 9-9-10 M2 – 9-10

    Panel 3:
    DYS607 M1 – 15 M2 – 16
    DYS570 M1 – 18 M2 – 19
    CDY M1 – 35-35-40 M2 – 36-40
    DYS442 M1 – 15 M2 – 14

    I would be pleased to hear what you think this match should be listed as.

    Thanks,
    Caroline

  13. In looking at the list of two value markers supplied to you by ftdna, it is not the same as those that actually have two value markers in any 111 marker test that I’ve seen. The list supplied is as follows:
    DYS19, DYS385, DYS459, YCAII, CDY, DYF395S1, DYS413, DYS425.
    However, looking at a y-dna 111 test, the ones with the dual value markers are as follows: DYS385, DYS389, DYS459, YCAII, CDY, DYF395S1, DYS413.

    So, DYS389 that is in the 111 test results as a dual value marker is missing from ftdna’s list. And DYS19 & DYS425 that are in the ftdna list do not appear as dual value markers in the 111 test results.

    What am I missing, or is the ftdna list incorrect?

    • Hi Robert. DYS389 is treated as two markers, so it’s not treated as a single marker with two values. 425 often has zero, which is also treated as an infinite allele. DYS19 is technically a dual value marker, but very few people have two values. When they do, it’s simply shown with two values. I have seen a few of these in the reports I’ve done for clients. Before I posted this, I checked with FTDNA to be sure of the answers.

  14. Hi Roberta,

    Your article says that all changes result in less restrictive matching and suggests that FTDNA previously used step-wise calculations of gd even for the multi-copy (aka palindromic) STRs. But wasn’t the infinite allele model previously used on those STRs? Under the infinite allele model wouldn’t the max gd for a dual-value marker be 1 but under the method described above, it could be 2? And, under the infinite allele model, wouldn’t the max gd for 464 be 2 but under the method described above it could be 5 (1 for additional markers and 1 for each of the 4 shared markers if none of the numbers match)?
