Big Y Release

Drum roll…the big day is finally here.

Family Tree DNA held a webinar meeting today to explain the new Big Y product features for a number of us who blog or otherwise educate within the genetic genealogy community.

First, the results will begin rolling out today, not tomorrow.  The first 100 will be released today, and the balance of the initial orders will be released as they finish QA over the next month, at which point Family Tree DNA anticipates their backlog will be resolved.  Thousands of tests were ordered, but they aren’t saying how many thousands.

Next, a little background.  There are 36,562 known Y SNPs in the Family Tree DNA database that everyone is being compared to.  In the example we saw of the delivered product, 25,749 of them had been found and were callable at a high confidence level in the individual being tested, and were reported.  Low confidence calls are not reported on this personal delivery page, but are included in the data download files.

Big Y landing

On the customer’s personal page, there are two tabs.  The first tab is for reporting against known SNPs.

Y page 1 cropped

The second is for Novel Variations, in other words, SNPs not on the list of 36,562 known and previously named SNPs.

Y page 2

In essence, Family Tree DNA has implemented a 4 step process.

  1. An individual’s sequenced data is compared to the SNP database and divided into two categories, known and previously unknown.  The customer’s data is delivered based on these two categories.
  2. All customer data is being loaded into a mammoth database, at which point it will be determined which SNPs (please see the definition of a SNP here) are actually undiscovered SNPs that will be named, and which are truly novel, family or clan variants.
  3. New SNPs that are found in enough of the population will be named and will be added to the haplotree.
  4. Novel variants will remain that, and will continue to be reported on client pages.
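
Step 1 above amounts to a set-membership split.  Here is a minimal Python sketch of that idea, assuming each variant can be identified by its position on the Y chromosome and that the known-SNP database can be represented as a dictionary from position to SNP name.  The function name, the data layout, and the sample positions and names are all hypothetical, made up for illustration; this is not Family Tree DNA's actual implementation.

```python
def split_variants(called_positions, known_snps):
    """Divide called variant positions into known SNPs and novel variants.

    called_positions: iterable of int positions where the tester has a variant.
    known_snps: dict mapping position -> SNP name (hypothetical layout).
    Returns (known, novel): a dict of position -> name, and a sorted list
    of positions that are not in the known-SNP database.
    """
    known = {}
    novel = []
    for pos in sorted(set(called_positions)):
        if pos in known_snps:
            known[pos] = known_snps[pos]
        else:
            novel.append(pos)
    return known, novel


# Made-up positions and names, purely for illustration.
known_db = {1001: "SNP-A", 2002: "SNP-B", 3003: "SNP-C"}
calls = [2002, 4004, 1001, 5005]

known, novel = split_variants(calls, known_db)
print(known)  # variants that go on the known-SNP tab
print(novel)  # variants that go on the Novel Variations tab
```

Steps 2 through 4 would then revisit the novel list as more testers are added, which is why a variant that is novel today may be named later.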

Family Tree DNA is still working on items 2-4.  In addition, they are working on a white paper which will be out in the next 6 weeks or so that will discuss things like the average number of novel SNPs per person being discovered, mutation rates, performance metrics and cross validation of platforms between the next gen sequencing Illumina equipment, Sanger sequencing and chip based sequencing, like the Geno 2.0 chip.

What’s Being Reported?

According to Dr. David Mittelman, the Y chromosome has about 60 million letters.  About half of those are inverted repeats and are therefore not sequenceable.

Of the balance, there are several with poor readability, for example, some that simulate the X, etc.  These are also not useful or reliable to read.

That leaves about 10 million, these being the gold standard of Y sequencing.  Family Tree DNA tries to read about 13.5 million base pairs.  They promised 10 million positions when they announced this product, and they are delivering between 11.5 and 12.5 million positions per person.  They also promised about 25,000 common variants, meaning known SNPs, and they are delivering between 25,000 and 30,000 per person.  This is only counting medium to high confidence calls.  The low confidence calls are included in the download files, but not counted in this total or shown on your personal page.

Exactly how many locations are reported for any individual is shown on the bottom left hand side of the page.  This example is generic.  Yours might say something like, “Showing 1 of 10 of 25,000 of 36,564.”  In this case, 25,000 would be the number of SNPs read and called on your test.

Big Y total

All 25,000 or so results are being shown, both positive and negative.  That way, there is no question about whether a specific location was tested, or what the outcome was.  Of course, the third and fourth possible outcomes are a no-call or a poor confidence call at that location.

All novel mutations are being reported by reference number so that they can be compared to like data from any source, as opposed to an “in-house” assigned number.

Insertions and deletions are also in the download files, but not reported on the customer’s delivery page.

Personal data is also searchable by SNP.

SNP search

Individual SNP Testing

After steps 2 and 3 have occurred, it has to be determined which SNPs are found in a high enough percentage of a population to warrant primer development to test individual SNP positions.

Family Tree DNA also clarified something from the November conference.  The 2000 SNP limit is only how many SNPs can be loaded at one time, not the total number they will ever develop primers for or test for.  They will do what makes sense in terms of the SNP being present in enough of the market to warrant primer development.  With the very large number of Novel SNPs being discovered, it wouldn’t make much sense to purchase 50 individual SNP tests at $39 each.  The break-even point today, at $39 each, is about 17 individual SNPs ($663); an 18th test ($702) would push the total past the $695 Big Y test.  I expect that eventually the demand for individual SNP testing will decrease substantially.
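
The break-even figure above can be checked with a few lines of arithmetic.  This is just a sketch using the two prices quoted in this post ($39 per individual SNP test and $695 for the Big Y); the function names are my own, and prices may of course change.

```python
BIG_Y_PRICE = 695      # dollars, as quoted in this post
SINGLE_SNP_PRICE = 39  # dollars per individual SNP test, as quoted in this post


def max_snps_under(big_price, snp_price):
    """Largest number of individual SNP tests costing no more than big_price."""
    return big_price // snp_price


def min_snps_over(big_price, snp_price):
    """Smallest number of individual SNP tests costing more than big_price."""
    return big_price // snp_price + 1


affordable = max_snps_under(BIG_Y_PRICE, SINGLE_SNP_PRICE)
tipping = min_snps_over(BIG_Y_PRICE, SINGLE_SNP_PRICE)

print(affordable, affordable * SINGLE_SNP_PRICE)  # 17 663
print(tipping, tipping * SINGLE_SNP_PRICE)        # 18 702
print(50 * SINGLE_SNP_PRICE)                      # 1950, the 50-test scenario
```

In other words, 17 individual tests still come in under the Big Y price, while 50 individual tests would cost nearly three times as much.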

Downloadable Files

Available on everyone’s page is the ability to download 2 files: a VCF (Variant Call Format) file, which lists the variants identified as compared to the human reference sequence, and a BED file, which is a text file showing the ranges of positions that passed QC.
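
For those who want to work with the downloads directly, here is a minimal Python sketch of reading the two file types and checking whether a variant falls inside a range that passed QC.  It relies only on the general format conventions: VCF data lines are tab-delimited with CHROM, POS, ID, REF, ALT, QUAL and FILTER as the first seven columns and use 1-based positions, while BED lines give chrom, start and end with 0-based, half-open ranges.  The sample lines, the "chrY" naming and the "LowQual" filter value are made up for illustration; I am not assuming anything about what the actual Family Tree DNA files contain beyond those conventions.

```python
def parse_vcf_lines(lines):
    """Yield (chrom, pos, ref, alt, filter) from VCF text lines.

    Lines starting with '#' are headers.  POS is 1-based in VCF.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        yield chrom, int(pos), ref, alt, filt


def parse_bed_lines(lines):
    """Return a list of (chrom, start, end) ranges from BED text lines.

    BED ranges are 0-based and half-open: [start, end).
    """
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def is_covered(chrom, pos, intervals):
    """True if a 1-based VCF position falls inside any BED range."""
    zero_based = pos - 1  # convert VCF's 1-based POS to BED's 0-based scale
    return any(c == chrom and s <= zero_based < e for c, s, e in intervals)


# Made-up sample content, purely for illustration.
vcf_text = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrY\t150\t.\tA\tG\t60\tPASS\t.",
    "chrY\t900\t.\tC\tT\t12\tLowQual\t.",  # hypothetical filter value
]
bed_text = ["chrY\t100\t200", "chrY\t500\t600"]

regions = parse_bed_lines(bed_text)
for chrom, pos, ref, alt, filt in parse_vcf_lines(vcf_text):
    print(chrom, pos, ref, alt, filt, is_covered(chrom, pos, regions))
```

The one-position offset between the two formats is the detail most likely to trip people up when comparing a position in the VCF against the ranges in the BED file.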

They will also be making available the BAM raw data files within the next week or so, but are finalizing the delivery methodology due to the very large file sizes involved.

The Much Anticipated HaploTree

If I had a dollar for every time someone has asked when the new tree would be available, I’d be a rich woman.  As we all know, there have been a couple of problems with the tree.  The new tree is 7 to 8 times the size of the 2010 tree.  The tree, of course, has been cast in warm jello, an ever-moving target.  And with the SNP tsunami that has been arriving with the full sequencing of the Y chromosome, that tree will very shortly be much larger still.

Bennett Greenspan said today that an updated tree is, “Needed, desired and will be delivered.”  He went on to say that they have had two teams working together with Nat Geo for the past couple of months to both finalize the tree itself and to work on the customer interface.  Since the tree is much larger, it’s not as easy as the older trees which could be seen at a glance and easily navigated.  Furthermore, there is also the matter of integration with National Geographic.

Bennett says an updated tree will be delivered “within the next several weeks.”

New SNPs that are discerned to be SNPs and not novel/clan or family variations will then be named and added to the tree.

Integration

The initial release of Big Y data will be just that, a release of the results of the data, displayable on your personal page and downloadable.  The newly found SNPs will not initially update the current haplotree on your personal page.  This is the same issue we have today with the transfer and integration of Nat Geo data, because the tree is not current, so this is nothing new.  The implementation of the new tree, however, will remedy both problems.

The Future

Never happy with what we have, genetic genealogists will want a way to match to other people on SNPs, just like we do today with STR markers.  In fact, we’ll want a way to integrate that matching and discern what it means to our own private family or clan situations.

Family Tree DNA is aware of that, planning for it, and welcomes feedback for how they can make this information even more useful in the future than it is today.

New Orders

I expect this delivery of new information via Big Y results will indeed spur a new interest in ordering this test from people who were waiting to see exactly what was being delivered.  For those people ordering now, they can expect an 8-10 week turnaround, so long as additional vials aren’t required for testing.

For More Information

Elise Friedman is holding the free Big Y Webinar tomorrow, Friday, February 28th.  You can read about it, sign up and learn how to access this and other webinars after their initial showing at this link.

Family Tree DNA FAQ pages you’ll want to visit are here and here.

17 thoughts on “Big Y Release”

  1. As the results are released, will the individuals tested be notified in any way or will we just have to periodically check the FTDNA page?

  2. As one of the not-100, I did notice a new BIG Y results section on my page. It’s in the larger area of the other results section on the right side of the page (mtDNA, Y-DNA, and FamilyFinder). Right now it only takes me to the Pending area, same as the tab in Other results tab.

  3. I’m a bit peeved by this 100 release limit. We were told mid February, then February 28th explicitly. It seems like nothing FTDNA says anymore can be held as true?!? Not a good way to build up customer confidence at a time when the support at the company has been flaky at best, downright crappy at worst over the past year–PEOPLE WANT TO BE TOLD THE TRUTH!!! :( Add to that project admins who have been passing along what we are being told and now WE look like LIARS too!! :(

    • There is a theory that FTDNA intentionally picked the 100 best results and released them “in advance” before the other results. Reason? Trick people that have not ordered BigY yet that the BigY find many, many new SNP’s. Hopefully the theory is not the truth, but you never know. All I can say is don’t order until the rest is released and when we got a true new SNP hit count from the test.

      • Well of the three I saw yesterday, one had only 25 novel variants, one had 241 and the other over 600, so if they were going for the “best” they would not have included the one for 25 I wouldn’t think.

      • robertajestes: Well, you are correct if the truth is that 25 novel variants is a “low score” on BigY. If 25 is above average, then including the result in the first 100 was a strategically correct move from FTDNA (if the theory is true.)

  4. I am a Big Y Test customer who has not yet received his test results. I have asked the questions below on the FTDNA Forum but have not seen any reply; as well, I am waiting for the Customer Service Desk to provide answers to the following two questions:

    1. Does FTDNA have the materials (including the reagent that caused the delay) in its possession required to complete all Big Y Tests by 28 March 2014?

    2. To my understanding there has only been a general date of 28 March 2014, in which FTDNA expects all Big Y Tests to be completed. Will FTDNA be sending out a more detailed list of when tests will be processed?

  5. Hello everyone,

    Just wanted to comment on my 5 March 2014 post, I have had some questions asked regarding it, which has made me realize that I did not clearly articulate the situation. I had been waiting for a response to some questions I posed on the FTDNA Forum a few days prior, however I had contacted the FTDNA Customer Service email just prior to adding comments to this Blog.

    On a positive note the FTDNA Customer Service representative responded to my email in a very timely manner, in which it was confirmed that FTDNA does indeed have all the materials (including reagents) in their possession required to test the remainder of the Big Y customers. As well, FTDNA are expecting to be releasing a more detailed timeline for the remaining Big Y Test results releases shortly.

  6. I have received my Big Y Results and am very glad I signed up for the test. Prior to taking the Big Y, I knew that I was L1065. Results have immediately taken me 4 SNPs downstream of L1065. On top of this, the Big Y has me positive with high confidence for an additional 82 Novel (variant) SNPs. Once these novel/variant SNPs are compared with the results of others (which may take some time) much will be learned about my paternal lineage.

    Although the Big Y had a rocky start this time around (the remainder of the Big Y test results are scheduled to be completed prior to 28 March 2014) I am extremely satisfied with the product that FTDNA has created.

    Once all the novel/variant SNPs are sorted out, I believe that the Big Y will be the most beneficial genetic genealogy test on the market. From my personal experience with both my currently known results and the follow-on information that is expected, I highly recommend this product to anyone interested in their paternal ancestry. Well worth the investment.

