Big Y-500 STR Matching

Family Tree DNA recently introduced Big Y-500 STR matching for men who have taken the Big Y-500 test. This is in addition to the SNP results and matching. If you’d like an introduction or definition of the terms STR and SNP, you can read about SNPs and STRs here.

Beginning in April 2018, Family Tree DNA included an additional 389+ STR markers as a free bonus for all Big Y testers, including earlier testers.

While the Big Y-500 STR marker values have been included in customers’ results for several months, unless you contacted your matches directly, you didn’t know how many of those additional markers above 111 you matched on – until now.

If you haven’t taken the Big Y test, the article Why the Big Y Test? will explain why you might want to. In addition to the Big Y results, which refine your haplogroup and scan the entire gold standard region of the Y chromosome looking for SNPs, you’ll also receive at least 389 Y STR markers above the 111 STR panel for a total of at least 500, for free – which is why the name of the Big Y test was changed to the Big Y-500. If you haven’t tested at the 111 marker level, don’t worry about that because the cost of the upgrade is bundled in the price of the Big Y-500 test. Click here to sign in to your account and then click on the blue upgrade button to view pricing.

Big Y-500 STR Matching

To view your matches and values above the traditional 111 markers, sign on to your account and click on Y DNA matches.

You’ll see the following display.

Y500 matches

The column “Big Y-500 STR Differences” is new. If you have not taken the Big Y-500 test, you won’t see this column.

If you have taken the Big Y-500, you’ll see results for any other man that you match who has taken the Big Y-500 test. In this example, 5 of this person’s matches have also taken the Big Y-500 test.

What Are Big Y-500 STR Differences?

The “Big Y-500 STR Differences” column values are expressed in the format “4 of 441” or something similar.

The first number represents the number of non-matching locations you have above 111 markers – in this case, 4. In the csv download file, this value is displayed in the “Big Y-500 Differences” column.

The second number represents the total number of markers above 111 that have a value for both of you – in this case, 441. In other words, you and the other man are being compared on 441 marker locations. In the csv download file, this value is displayed in the “Big Y-500 Compared” column.
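For anyone working with these values in a spreadsheet or script, here is a minimal sketch of how a value such as “4 of 441” could be split into its two numbers. The function name is hypothetical and the only format assumed is the “N of M” pattern described above; this is not Family Tree DNA code.

```python
import re

def parse_str_differences(text):
    """Split a value such as '4 of 441' into (differences, compared).

    Hypothetical helper: assumes the 'N of M' format described in the article.
    """
    match = re.fullmatch(r"\s*(\d+)\s+of\s+(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    return int(match.group(1)), int(match.group(2))

differences, compared = parse_str_differences("4 of 441")
# Markers above 111 where both men have a value and the values agree.
matching = compared - differences
print(differences, compared, matching)
```

With the example value, that leaves 437 of the 441 compared markers in agreement.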

Because the markers above 111 are processed using NGS (next generation sequencing) scan technology, virtually every kit will have some marker locations that have no-calls, meaning the test doesn’t read reliably at that location in spite of being scanned several times.

It’s more difficult to read STRs accurately using NGS scan technology, as compared to SNPs. SNPs are only one position in length, so only one position needs to be read correctly. STRs are repeats of a sequence of nucleotides. A 20 repeat sequence could consist of 20 copies of a series of 4 nucleotides, so a total of 80 positions in a row would need to be successfully read several times.

Let’s take a look at how matching works.

How Does Big Y-500 STR Matching Work?

If you have a total of 441 markers that read reliably, but your match has a total of 439 that produced results, the maximum number of markers you could be compared on would be 439. If you both have no-calls on different marker locations, you would be compared on fewer than 439 locations. Here’s an example just using 9 fictitious markers.

Y500 match example

Based on the example above, we can see that the red cells can’t match because they experienced no-calls, and the yellow cells do have results, but don’t match.

Y500 summary
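The comparison logic described above can be sketched in a few lines. This is only an illustration: the marker names and values below are invented (they are not the values from the example image), a no-call is represented as None, and the function name is hypothetical.

```python
def compare_markers(mine, theirs):
    """Return (differences, compared) for two sets of marker values.

    A marker is compared only when both men have a value (no no-call).
    Any mismatch counts as one difference, however large the gap.
    """
    compared = 0
    differences = 0
    for name, my_value in mine.items():
        their_value = theirs.get(name)
        if my_value is None or their_value is None:
            continue  # no-call for at least one man, so the marker is skipped
        compared += 1
        if my_value != their_value:
            differences += 1
    return differences, compared

# Invented values for 9 fictitious markers; None marks a no-call.
man_a = {"M1": 12, "M2": 9, "M3": None, "M4": 15, "M5": 10,
         "M6": 8, "M7": 11, "M8": 14, "M9": 13}
man_b = {"M1": 12, "M2": 10, "M3": 7, "M4": 15, "M5": None,
         "M6": 8, "M7": 11, "M8": 16, "M9": 13}

differences, compared = compare_markers(man_a, man_b)
print(f"{differences} of {compared}")  # prints "2 of 7"
```

Each man has one no-call at a different location, so only 7 of the 9 markers can be compared, and 2 of those 7 have different values.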

New Filter

There’s also a new filter option so you can view only matches that have taken the Big Y-500 test.

Y500 filter

Let’s look at some of the questions people have been asking.

Frequently Asked Questions

Question 1: Are the markers above 111 taken into account in the Genetic Distance column?

Answer: No, the values calculated in the genetic distance column are the number of mismatches for the marker level you are viewing using a combination of the step-wise and infinite alleles mutation models. (Stay with me here.)

In our example, we’re viewing the 111 marker level, so the genetic distance tells you the number of mismatches at 111 markers. If we were viewing the 67 marker level, then the genetic distance would be for 67 markers.

The number of mismatches above 111 markers shows separately in the “Big Y-500 STR Differences” column and is calculated using the infinite alleles model, meaning every mutation is counted as one difference. You can read more about genetic distance in the article, Concepts – Genetic Distance.

The good news is that you don’t need to calculate anything, but you may want to understand how the markers are scored and how the genetic distance is calculated. If so, go ahead and read question 2. If not, skip to question 3.

Question 2: What’s the difference between the step-wise model and the infinite alleles model?

Answer: The step-wise model assumes that a multi-step difference on a particular marker arose one step at a time, meaning a difference between a 28 for one man and a 30 for another is the result of two separate mutation events that happened at different times, so it is counted as 2 mutations, 2 steps, and a genetic distance of 2.

However, this doesn’t work well with palindromic markers, explained here, where multi-copy markers, such as DYS464, often mutate more than one step at a time.

Counting multiple mathematical differences as only one mutation event is called the infinite alleles model. For example, a dual copy marker that has a value of 15-16 could mutate to 15-18 in one step and would be counted as one mutation event, and one difference and a genetic distance of one using the infinite alleles model. The same event would count as 2 mutation events (steps) and a genetic distance of 2 using the step-wise mutation model. In this article, I explain which markers are calculated using which methodology.

Another good infinite alleles example is when a location loses its DNA at a marker entirely. If the marker value for most men being compared is 10 and is being compared to a person with no DNA at that location, resulting in a null value of 0 (which is not the same as a no-call, which means the location couldn’t be read successfully), the mutation event happened in one step, and the difference should be counted as one event, one step and a genetic distance of one, not 10 events, 10 steps and a genetic distance of 10.

To recap, the values of markers 1-111 are calculated by a combination of the step-wise model and the infinite alleles model, depending on the marker number and situation. The differences in markers above 111 are calculated using the infinite alleles model, where every mutation or difference equals a distance of one, including the case where a zero (null) is encountered. However, above 111 markers, using NGS technology, most instances where no DNA is encountered result in a no-read, not a null value.
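To make the two models concrete, here is a simplified sketch that scores the three examples from this answer both ways. It is an illustration only: the function names are hypothetical, each marker is written as a tuple of copy values, and Family Tree DNA’s actual per-marker rules for markers 1-111 are not reproduced here.

```python
def stepwise_distance(a, b):
    """Step-wise model: every repeat of difference counts as one step."""
    return sum(abs(x - y) for x, y in zip(a, b))

def infinite_alleles_distance(a, b):
    """Infinite alleles model: any difference counts as one event."""
    return 0 if a == b else 1

# Examples from the text; single-copy markers are 1-element tuples.
examples = {
    "28 vs 30": ((28,), (30,)),
    "15-16 vs 15-18 (dual copy)": ((15, 16), (15, 18)),
    "10 vs null (0)": ((10,), (0,)),
}

for label, (a, b) in examples.items():
    print(label, stepwise_distance(a, b), infinite_alleles_distance(a, b))
```

The step-wise scores come out as 2, 2 and 10, while the infinite alleles score is 1 in every case, which is why the null example is so misleading if it is scored step-wise.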

Question 3: Has the TIP calculator been updated?

Answer: No, the TIP calculator does not take into account the new markers above 111. The TIP calculator relies upon the combined statistical mutation frequency for each marker and includes haplogroup differences. Therefore, it would be difficult to compensate for different numbers of markers, with various markers missing for each individual above 111 markers. The TIP calculator only utilizes markers 1-111.

Question 4: Do projects display more than 111 markers?

Answer: No, projects don’t display the additional markers, at least not yet. The 111 marker results require scrolling to the right significantly, and 500 markers would require 5 times as much scrolling to compare values. Anyone with an idea how to better accomplish a public project display/comparison should submit their idea to Family Tree DNA.

Question 5: Which markers above 111 are fast versus slow mutating?

Answer: Results for these markers are new and statistical compilations aren’t yet available. However, initial results for surname projects in which several men who share a surname and match have tested indicate that there’s not as much variation in these additional markers as we’ve seen in the previous 111 markers, meaning Family Tree DNA already selected the most informative genealogical markers initially. This suggests that the additional markers may provide additional mutations but probably not five times as many as the initial 111 markers.

Question 6: Why do I have more mutations in the first 111 markers than I do in the 389+ markers above the 111 panel?

Answer: That’s a really good question. You’ve probably noticed in our example that the men have disproportionately more mutations in the first 111 markers than in the markers above 111.

Y500 genetic distance

The trend is clearly for the first 111 markers to mutate more frequently than the 389+ markers above 111. This means that the first 111 markers are generally going to be more genealogically informative than the balance of the 389+ markers. However, and this is a big however, if the line-marker mutation that you need to sort out your group of men occurs in the markers above 111, the number of mutations and the percentages don’t mean anything at all. The information that matters is how you can utilize these markers to differentiate men within the line you are working with, and what story those markers tell.

Of course, the markers above 111 are free as part of the Big Y-500 test which is designed to extract as much SNP information as possible. In essence, these STR markers are icing on the cake – a treat we never expected.

Bottom Line

Here’s the bottom line about the Big Y-500 STR markers. You don’t know what you don’t know and these 389+ STR markers come along with the Big Y test as a bonus. If you’re looking for line-marker STR mutations in groups of men, the Big Y-500 is a logical next step after 111 marker testing.

_______________________________________________

Disclosure

I receive a small contribution when you click on the link to one of the vendors in my articles. This does NOT increase the price you pay, but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

18 thoughts on “Big Y-500 STR Matching”

  1. Thank you yet again Roberta. I had been pondering all questions covered. I’m also wondering if you think there is any value in going back to check 12-marker matches with respect to noting possible connections to surnames known to history before mine (Routledge/Rutledge) first appears in 1400s. Example: 500+ matches at 12-markers with 100+ having taken BigY500. I was delighted to find ancient surnames such as Sutherland, Cummins, Sinclair. I know a bit about convergence and that they are not recently related, but I do assume that there was a MRCA somewhere back within the BIGY500 estimated time frame. If you have any thoughts on this, I would much appreciate your comments.

  2. Good article Roberta. It explained many of the things that I was wondering about. I have only one match in the 67 marker group that has also done the Big Y-500, and none in the 111 marker group. I was hoping that these additional markers would provide further insight but I suppose that it is a matter of waiting for more people to test.

  3. Hi Roberta! Thank you. Would twin brothers have 100% identical Big Y 500 markers? Trying to decide here if it makes sense to pay for 2nd twin to test beyond 37 markers. Goal here is to identify last name and country, going back 10 generations or so (brick wall is early 1700s at U.S. immigrant ancestor’s parents).

    • Yes, or if there is a difference it’s due to sequencing. If they are fraternal twins and there’s a difference, it happened in this generation between father and son. So no help genealogically.

  4. I have been using Y-STRs and FTDNA’s great database of men to help find adoptee’s paternal surnames. If an adoptee tests BigY and has a couple matches beyond 111 markers into the BigY zone, does this give him any more certainty above 111 markers as to what his biological paternal surname is? Both BigY matches have the same surnames.

    I’m just wondering if this upgrade is useful for the adoptee crowd or not.

    • Oh, yes, it’s absolutely useful for adoptees. The two tests combined if men match on both is a great tool. However, men can vary on both STR and SNP markers and still be closely related, and the reverse too. Use Y in combination with autosomal for adoptees.

  5. Your post does a good job of explaining the issues involved in 500 STR matching vs. 111 marker matching, and you have also covered Big Y in other posts, but in spite of reading these posts I am still wondering how best to use my results from these three Y DNA tests. I only have one person who shows up in all of these tests for me, and even then he only shows up at 67 markers and not 111 markers, and at a G.D. of 7. I have no close matches. The only ray of hope for making sense of these matches, is that the 5 matches I have for my terminal SNP for the Big Y test, all say that they have Doan(e) paternal ancestry. I take this to indicate that there may have been an NPE between the Doan(e)s and the Smiths in the distant past, but probably very far back. Any advice for me?

    • These types of situations lend themselves more to an individual analysis. I can’t really do this without seeing the results themselves, which is why I created the quick consults for people.

  6. Honestly I doubt the Y500 will ever amount to much. In contrast to conventional wisdom, in all the results I’ve ever seen, SNP changes are substantially more frequent than STR changes (even on the Y111 level).

    I’ve compared about twenty Y500 results where the common ancestor is known to be more than 2000 years ago. I did not see any changes in any of the Y500 markers with 12 or fewer repeats. There are about 60 markers with double-digit counts, and only a dozen with counts above 12. In projects, these dozen could be added to the end of the Y111 table, I think. The ones with a rep count of 4 or 5 are completely useless I think.

    The Y500 changes only seem to happen about once every 500 years or so.

    By contrast, an SNP seems to happen extremely close to once every hundred years.

    In summary — it’s more of a Y150 test, than a Y500. Only a little more info than Y111.

  7. BTW on your blog, you generally present BigY as being relevant only for the distant past. This is not true! I discovered who my biological grandfather was, through BigY. (Subsequently confirmed by autosomal DNA). This is probably the only time that’s happened so far, but BigY put me in contact with immediate family members who are still alive. As more people are tested, this will probably start happening more often.
    BigY also gave me 21 generations of well-documented patrilineal ancestors, all the way back to the 1200’s (yes, really!) so I got the absolute best-case scenario.

    I am the ultimate fan of Big Y.

    • You would also have made the same discovery using the STR markers. Or are you saying you’re not an STR match to him? That would be virtually impossible.

  8. No, I would not have made this discovery on the basis of STR markers. I didn’t tell you the full story. My BigY matches were all to people with common ancestors many hundreds of years ago.
    Note that this is a situation where you do not know the surname – even worse, you are certain that you know the surname, but you’re wrong!

    Even at Y111, my 6 closest matches have 5 different surnames. On Y67 it’s worse, only 7% of the matches have the surname. I would have just said, “I have no relevant matches, they are all spurious”.

    BigY does not have that ambiguity. And it gives you some clue of the relationship between your matches. That was crucial in my case.

      • Quite a few had taken BigY, at least enough to show that the list of Y111 matches is a bit deceptive. In particular, among the projects I am involved with, the marker DYS712 is *completely* unstable, and CDY is terrible as well. I wish I could remove those two from consideration of match quality, especially DYS712, it is worse than useless (and it should be displayed in brown on the FTDNA site).

        When you’re looking for very distant matches, these rapidly-changing markers are a huge problem.
