Downloading and Uploading 23andMe Files – v2 vs v3

Some days, it seems nothing is as simple as it should be.

If you recall, in my article, “Now What, 23andMe and the FDA,” one of my suggestions was to download your raw data file from 23andMe.  You can then upload it to both www.gedmatch.com and Family Tree DNA.  This gives you the added benefit of fishing in multiple ponds, regardless of what happens to 23andMe relative to the FDA situation.

I also mentioned that I was having a customer support nightmare with 23andMe trying to figure out what was wrong with 3 of my 5 files that I downloaded.

GedMatch had not been accepting new file uploads for a couple of weeks, so I couldn’t upload there, but I did attempt, unsuccessfully, to upload the files to Family Tree DNA.  I checked today, and GedMatch is accepting files again.

I subsequently discovered that the problematic files were short a significant amount of data.  In some past cases, the upload problem has been that the file in question was a build 36 file that had been downloaded earlier.  The solution in that case is easy: simply redownload the file from 23andMe and it will be in the current build format.

However, this was not the problem with these three files.  They were build 37 as confirmed by the header records in each file.

[Screenshot: file header showing build 37]

You can see in an earlier file, downloaded in 2009, my data was in build 36 format.

[Screenshot: file header showing build 36]
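If you would rather not open each file by hand, the build check can be sketched in a few lines of Python. This is only a rough sketch: it assumes the header is made up of comment lines starting with “#” and that one of them contains the word “build” followed by a number. The function name and the sample lines (including the rsid) are made up for illustration and are not copied from a real file.

```python
import re

def detect_build(lines):
    """Return the build number found in the '#' header lines, or None."""
    pattern = re.compile(r"build\s+(\d+)", re.IGNORECASE)
    for line in lines:
        if not line.startswith("#"):
            break  # assumed: header comments end where the data rows begin
        match = pattern.search(line)
        if match:
            return int(match.group(1))
    return None

# Illustrative header only; the wording of a real file may differ.
sample = [
    "# This data file generated by 23andMe",
    "# We are using reference human assembly build 37",
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t12345\tAA",  # made-up data row
]
print(detect_build(sample))
```

To check a real download, you would pass the opened text file to detect_build in place of the sample list.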

Finally, after two very frustrating weeks working with their customer support, 23andMe confirmed that the 3 files in question were indeed not the same length as the other 2 files, and that they came from an earlier version of their product, known as v2.  This information, unfortunately, was not reflected in their product revision history, shown below.

August 9th, 2012. We updated our database to report SNP positions using the NCBI Build 37 (also known as Annotation Release 104) genome assembly. Users will see changes in their raw data positions. Read more here.

September 29th, 2011. Analysis of our data has allowed us to improve the interpretation of several SNPs. In the next week, customers may see changes in their raw data.

January 13, 2011. We updated our database to incorporate data from a more recent build of dbSNP. Some rsids have changed location and/or flanking sequence in dbSNP such that our probes are no longer meaningful to assay them. The names of these rsids have been changed in the raw data to internal ids starting with “i499…”. We have also improved the interpretation of a number of SNPs and removed others that had poor data quality. In the next couple of days, customers may see changes in calls for those SNPs.

March 25, 2010. Analysis of our data has allowed us to improve the interpretation of several dozen SNPs. A portion of the SNPs are on the mitochondrial chromosome. In the next couple of days, customers may see changes in calls for those SNPs.

October 8, 2009. Analysis of our data has allowed us to improve the interpretation of over 1500 SNPs. A portion of the SNPs are on the mitochondrial chromosome. In the next couple of days, customers may see changes in calls for those SNPs.

June 4, 2009. Analysis of our data has allowed us to improve the interpretation of over 500 SNPs. Most of these SNPs are on the Y chromosome. In the next couple of days, customers will see calls for SNPs that previously had a no-call or appeared not genotyped.

April 9, 2009. Analysis of our data has allowed us to improve the interpretation of 10 SNPs: rs4420638, rs34276300, rs3091244, rs34601266, rs2033003, rs7900194, rs9332239, rs28371685, rs1229984, and rs28399504. In the next couple of days, some customers will see calls for SNPs that previously had a no-call or appeared not genotyped.

In late 2010, 23andMe added functionality to their product that included, among other things, Alzheimer’s risk information.  I was particularly interested in this information, so even though I had tested on an earlier platform, v2, at that time, I updated to the v3 test.

In December 2010, 23andMe began using the v3 chip, so everyone who tested after December 2010 will be on the v3 chip platform.  If you tested in December 2010, you might be on either one.  If you’re on the v3 chip, no worries.  If you are on the pre-December 2010 v2 chip, you will not be able to upload your data to Family Tree DNA because of compatibility issues.  Family Tree DNA utilizes significantly more SNP locations, over 700,000 in total, which is about 125,000 more than the v2 23andMe file contains.

However, GedMatch continues to accept v2 files according to site creator, John Olson.   Keep in mind that GedMatch is a free (donation based) volunteer site run by two project administrators, so when they get overwhelmed with file uploads, they shut the gate for a week or two as a means of preserving their sanity.  They are accepting files again as of today.

For me, this means I have two files uploaded to GedMatch, an earlier v2 file and now a later v3 file as well.  It will be interesting to see the differences between the matches to the two files.

In any case, if your results are v2 at 23andMe, you will have to retest to join the Family Tree DNA customer pool because the earlier 23andMe files can’t be used.

It’s relatively easy to tell whether your file is v2 or v3.  After downloading your file from 23andMe, if your zipped file is about 5 MB or smaller, it’s v2, while v3 files will be about 8 MB.  If you open the file in Notepad and copy it into Excel, a v2 file will have about 575,000 rows in the spreadsheet, whereas a v3 file will have about 950,000.
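The row count can also be checked without Excel. The Python sketch below counts the data rows and makes a rough guess at the chip version. It assumes that header lines start with “#” and that every other non-blank line is one SNP row. The 760,000 cutoff is simply my assumed midpoint between the approximate v2 and v3 counts above, not an official figure, and the function names and sample rows are hypothetical.

```python
def count_snp_rows(lines):
    """Count data rows, skipping blank lines and '#' header comments."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))

def guess_chip_version(row_count, cutoff=760_000):
    """Guess 'v2' or 'v3' from the row count.

    The cutoff is an assumed midpoint between roughly 575,000 (v2)
    and roughly 950,000 (v3) rows; it is not a 23andMe specification.
    """
    return "v2" if row_count < cutoff else "v3"

# Made-up miniature file, just to show the counting.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t12345\tAA",
    "rs0000002\t1\t23456\tCT",
    "",
]
print(count_snp_rows(sample))
print(guess_chip_version(575_000))
print(guess_chip_version(950_000))
```

For a real file, you would unzip the download, open the text file, and pass it to count_snp_rows in place of the sample list.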

Now that we’ve said all of that, we’re not even going to speculate about what the v4 chip that 23andMe is planning will do.  It’s not getting larger, it’s getting smaller again…so compatibility bets are off…that is….if there is a v4.  If 23andMe doesn’t get squared away with the FDA, it’s a moot point, which brings us back to why we were downloading our files in the first place.

9 thoughts on “Downloading and Uploading 23andMe Files – v2 vs v3”

  1. Roberta: 23andMe also has V2-V3 files. These are files made when customers upgraded their V2 test to V3. The V2/3 files are slightly larger than straight V3 files (files made for new tests, not upgrades to V3). Originally (prior to a few updates) the SNP counts were V2 = 576K, V3 = 967K and V2 upgraded to V3 = 996K.

    Matt.

  2. I asked Ancestry about “autosomal transfer uploads” and they said no and didn’t say why not. But I want to know. I was on ISOGG’s autosomal DNA testing comparison chart, and saw on the 14th line, “SNP chip used for testing”, that both FamilyTreeDNA and AncestryDNA have the same chip, “Illumina OmniExpress”. That makes you wonder. I can see if Ancestry can’t do uploads because they have a different type of chip, but if Ancestry uses the same type of chip as FamilyTreeDNA, why can’t they do uploads? Is it because Ancestry just doesn’t want to do uploads? http://www.isogg.org/wiki/Autosomal_DNA_testing_comparison_chart

    • You’d have to ask Ancestry. They probably want to sell their test. They have been very slow to do any development of any new features, even the most basic genetic comparisons, something that both 23andMe and Family Tree DNA have had forever.

  3. Just when Gedmatch starts accepting uploads, FTDNA has stopped its raw data download. Hope this is only temporary, but strange to me that it has occurred!

  4. Roberta, Perhaps you can help me to understand the mechanics of uploading my 23andMe raw data into my account at FamilyTreeDNA.com . I can get to the payment pages and complete, but it is not evident as to how I should proceed in uploading my data after I pay for it. Because of this, I cancelled my credit card purchase and am now back to square one. Any advice you are able to provide will be very much appreciated. Thanks, Pat Borden FTMdna kit # 96766

    • Now that you’ve cancelled the purchase you can’t do anything without repurchasing. What you do next is simply to sign on to your page and one of the buttons on the front page says “upload data” after you purchase that product.

  5. I would like to upload my 23andMe file, but all I am reading are comments – nothing definite. How do I do it?

