FamilyTreeDNA Thanksgiving Sale + New Comprehensive Health Report

FTDNA Thanksgiving.png

FamilyTreeDNA’s Thanksgiving Sale has begun. Almost everything is on sale. I don’t know about you, but I like to have all of my holiday planning and purchasing DONE before Thanksgiving. Some of the gifts I wanted for people this year are already sold out or backordered – but DNA testing is always available. The gift of history, and now of health too.

I wrote about the Big Y test and upgrades just a couple days ago, here, including the restructuring of the Big Y product resulting in a permanent $100 reduction, in addition to sale prices.

FamilyTreeDNA has made a few product changes and introduced the new Tovana Health test.

I’ve included a special section of frequently asked questions (and answers) about tests and when upgrading does, and doesn’t, make sense.

Individual Tests

Let’s start with the sale prices for individual tests.

Test | Sale Price | Regular Price | Savings
Family Finder (FF) | $59 | $79 | $20
Y DNA 37 | $99 | $169 | $70
Y DNA 111 *1 | $199 | $359 | $160
Big Y-700 *2 | $399 | $649 | $250
Mitochondrial Full Sequence *3 | $139 | $199 | $60

*1 – You may notice that only the 37-marker and 111-marker tests are listed above. The 111-marker test was reduced to the 67-marker sale price, so, at least during the sale, the 67-marker test is not available. In other words, you get 111 markers for the price of 67.

*2 – The Big Y-700 test includes the Y 111 test plus another 589 STR markers (to equal or exceed 700 markers total) plus the SNP testing. You can read about the Big Y here.

*3 – The mitochondrial full sequence (FMS) aka mtFullSequence test is now the only mitochondrial DNA test available. I’m glad to see this change. The price of the mtFullSequence test has now dropped to the level of the less specific partial tests of yesteryear. Genealogists really need the granularity of the full test.

Bundles save even more – an additional $9 on two-test bundles and $18 on three-test bundles, compared with purchasing the same tests separately at the sale prices.

Bundles

Test | Sale Price | Regular Price | Savings
Family Finder + mtFullSequence | $189 | $278 | $89
Family Finder + Y-37 | $149 | $248 | $99
Family Finder + Y-111 | $249 | $438 | $189
Y-37 + mtFullSequence | $229 | $368 | $139
Y-111 + mtFullSequence | $329 | $558 | $229
Family Finder + Y-37 + mtFull | $279 | $447 | $168
Family Finder + Y-111 + mtFull | $379 | $637 | $258
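
If you'd like to double-check the bundle math yourself, here's a minimal Python sketch. It recomputes the regular total and the savings for each bundle from the individual test prices listed earlier. The dictionary names and the bundle_summary function are my own illustration, not anything FamilyTreeDNA provides, and the prices are simply copied from the tables in this article.

```python
# Prices in US dollars, copied from the individual and bundle tables above.
SALE = {"FF": 59, "Y-37": 99, "Y-111": 199, "mtFull": 139}
REGULAR = {"FF": 79, "Y-37": 169, "Y-111": 359, "mtFull": 199}

BUNDLE_SALE = {
    ("FF", "mtFull"): 189,
    ("FF", "Y-37"): 149,
    ("FF", "Y-111"): 249,
    ("Y-37", "mtFull"): 229,
    ("Y-111", "mtFull"): 329,
    ("FF", "Y-37", "mtFull"): 279,
    ("FF", "Y-111", "mtFull"): 379,
}


def bundle_summary(tests, bundle_price):
    """Return the regular total, savings vs. regular price, and the extra
    savings compared with buying the same tests separately on sale."""
    regular_total = sum(REGULAR[t] for t in tests)
    sale_total = sum(SALE[t] for t in tests)
    return {
        "regular": regular_total,
        "savings": regular_total - bundle_price,
        "extra_vs_separate_sale": sale_total - bundle_price,
    }


for tests, price in BUNDLE_SALE.items():
    print(" + ".join(tests), bundle_summary(tests, price))
```

Savings here is just the regular total minus the bundle sale price, so any row can be verified by hand the same way.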

When Does Upgrading Make Sense?

Y DNA Q&A

Q – If I have several Y DNA matches, will upgrading help?

A – If you need more specific or granular information to tease your line out of several matches – upgrading will help refine your matches and determine who is a closer match, assuming some of your matches have tested at a higher level.

Q – If I have tested at a lower level of STR markers and have no matches, will I have matches at a higher level?

A – Sometimes, but not usually. If your mutations just happen to fall in the lower panels, you may have matches on higher panels that allow for more mutations. If you do have matches on a higher test in this circumstance, the person may or may not have your surname. You can also join haplogroup and surname projects where thresholds are slightly lower for matching within projects.

If you don’t test, you’ll never know.

Q – If I have no matches on STR markers, meaning 12, 25, 37, 67 or 111, will upgrading to the Big Y be beneficial?

A – Possibly to probably – and here’s why, even if you don’t initially have matches:

  • The Big Y-700 provides multiple tools including matches at the SNP level, not just the STR level, so you are matched in two entirely different ways.
  • You may have same-surname matches at the SNP level that you do not have at the STR level which are further back in time, but still valuable and relevant to your family history.
  • You may have SNP matches that aren’t STR matches that are not your surname, but reflect your family history before the advent of surnames. These matches can tell you where your family came from before you can locate them in records. In fact, this is the ONLY way you can track your family before the advent of surnames.
  • Even if you don’t have matches, you’ll receive all of your SNP markers that allow you to view your results on the Block Tree, which is in essence a migration map back through time. You can read about the Block Tree here.
  • Your test contributes to building the phylotree – meaning the Y DNA tree of man – which benefits all genealogists. In just the first 10 months of 2019, 32,000 new SNPs have been placed on the tree, resulting in about 5,000 new individual branches. All because of Big Y-700.
  • New people test every day and your DNA tests fish for you every minute of every day.

Mitochondrial DNA Q&A

If you’ve previously taken lower level mitochondrial HVR1 and HVR2 tests, now is the perfect time to upgrade.

Q – I have 5,000 <or fill in large number here> HVR1 level matches. Will upgrading reduce the number of matches to those that are more meaningful?

A – Absolutely! Your most genealogically relevant matches, meaning closest in time, are those that match you exactly at the full sequence level.

Q – I don’t know where my ancestor was from. Can a full sequence test help me?

A – Yes. You can use the Matches Map and see where the ancestors of your closest matches were from. That’s a huge hint. You can also utilize your haplogroup, which, in some instances, will point to a specific continent, such as Africa, Europe or Asia, or to Native American or Jewish populations.

Q – If I have no matches at the HVR1 or HVR2 level, will an upgrade help me?

A – Possibly. Both the HVR1 and HVR2 (now obsolete) tests only allowed for one mutation difference to be considered a match. The full sequence allows for many more differences. If you were unlucky and your mutations just happened to fall in the HVR1 or HVR2 regions, that would prevent a match that would otherwise occur at a higher level. Either way, you’ll receive information about your rare mutations – which may well explain why you don’t have matches (yet)! You’ll also receive a full haplogroup which will be useful, allowing you to use the mitochondrial haplotree to track back in time, which I wrote about here.

There are so many ways to obtain useful information. I wrote a step-by-step guide to using mitochondrial DNA, here.

Upgrade Options

Please note that if you are considering an upgrade, it may be beneficial to upgrade to the maximum test available for either the Y or mitochondrial DNA, especially if you cannot obtain more of the sample. Of course, if it’s your own sample, you can always swab again, but others can’t.

Every time a vial is opened for testing, more DNA is used, until there is none left. Additionally, DNA degrades with time, depending on the quality of the original scraping and the amount of bacteria in the sample. Generally, the sample is viable for at least 5 years, but not always. Some older samples remain viable for many years. There’s no way to know in advance.

Test | Sale Price | Regular Price | Savings
Y-12 to Y-37 | $79 | $109 | $30
Y-12 to Y-67 | $149 | $199 | $50
Y-12 to Y-111 | $169 | $359 | $190
Y-25 to Y-37 | $49 | $59 | $10
Y-25 to Y-67 | $119 | $159 | $40
Y-25 to Y-111 | $149 | $269 | $120
Y-37 to Y-67 | $69 | $109 | $40
Y-37 to Y-111 | $119 | $228 | $109
Y-67 to Y-111 | $69 | $99 | $30
Y-12 to Big Y-700 | $359 | $629 | $270
Y-25 to Big Y-700 | $349 | $599 | $250
Y-37 to Big Y-700 | $319 | $569 | $250
Y-67 to Big Y-700 | $259 | $499 | $240
Y-111 to Big Y-700 | $229 | $499 | $270
Big Y-500 to Big Y-700 | $189 | $249 | $60
HVR1 to mtFullSequence | $99 | $159 | $60
mtDNA Plus to mtFullSequence | $99 | $159 | $60

Tovana – A New Limited Availability Exome Medical Report 

Recently, FamilyTreeDNA made a limited announcement about a medically supervised exome health test for a subset of customers, specifically customers who:

  • Don’t live in Pennsylvania, New York, California or Maryland, due to state law restrictions.
  • Took the Family Finder test since October 2015 – meaning no transfers. The Family Finder test is used in conjunction with the exome chip to generate the customer report.

If you took the Family Finder test before October 2015, you are eligible but the rollout is being done in stages and your kit will be eligible in December.

This Tovana Genome Report is focused towards people who are health and wellness conscious. Meaning those who don’t want to die a premature death that might be preventable.

All genetic health tests focus on predispositions. You may or may not develop the condition, with a few notable exceptions, but forewarned is forearmed.

You might, however, be VERY interested in intervening, one way or another, BEFORE you develop potentially life-threatening conditions, or taking preventative actions to avoid developing those conditions. At the very least, you can be aware and monitor your health to catch them early, when they are treatable, manageable or potentially curable.

It only takes one, ONE, terrifying experience to convince you that health testing might make a difference.

Once you’re embroiled in that health nightmare, there is no going back in time to take a test and enact preventative measures.

My mother might still be with us had we known she was susceptible to blood clots. My sister had metastatic breast cancer.

Let me show you something from a Tovana report.

FTDNA Tovana.png

This portion of a page from an actual customer report shows this individual is positive for a mutation for a clotting disorder where clots are formed that can cause strokes, pulmonary embolisms and DVTs (deep vein thrombosis).

I’d give anything, any amount of money – to have had advance warning so we could have watched my mother more vigilantly and taken simple proactive measures that might have prevented her stroke and resulting death.

What would another 10 or 15 years with her have been worth?

We could have and would have discussed this with her doctors and asked about preventative measures, like taking aspirin or other measures as indicated by her health and other medications. (Please do not self-diagnose or medicate without discussing with your physician as drugs interact in ways patients may not be aware of.)

Compared to hospital (or funeral) bills, not to mention the sheer agony…the cost of this test at $799 is irrelevant. What better way to say, “I love you”?

I would have picked up bottles by the side of the road, if I needed to, to be able to purchase this test for my Mom 15 years ago. Sadly, this type of testing wasn’t available then, but it is now.

Ignorance is not bliss.

I want to know if I or my children carry these predispositions so that we can take action.

The Tovana Test is Different

The Tovana test is different from and much more comprehensive than the tests offered by Ancestry, MyHeritage and 23andMe that utilize only your autosomal genealogy test.

To begin with, the Tovana test is run on an exome chip that tests over 50 million locations in addition to the 700,000+ locations tested in the Family Finder test.

The completed report that I viewed was 128 pages in length, with lots of graphics. This example explains autosomal dominant inheritance.

FTDNA Tovana autosomal dominant.png

The report is very user-friendly, including drawings, a risk-meter for polygenic conditions that involve more than a simple yes or no answer, explanations and recommendations for each condition reported.

FTDNA Tovana risk meter.png

And yes, in case you’re wondering, the report also includes the fun traits like ear wax and such that you can discuss if you’re bored beyond imagination at a cocktail party.

Each report is centered around and tailored to the family information you provide, such as known Jewish heritage, or known cases of cancer.

FTDNA Tovana Table of Contents.png

Comparisons

I’ve compiled a chart with some comparison details – although this test is in a class by itself where the other three tests compete directly with each other.

I’ve personally taken the other tests, except for the Ancestry upgrade. I also took an early exome test a few years ago, but THAT ONE CAME WITH NO REPORT OR EXPLANATIONS.

  23andme Ancestry Health Core MyHeritage Family Tree DNA Tovana Test
# DNA locations tested About 700,000 About 700,000 About 700,000 >50 million plus the 700,000 in the Family Finder test
# Results Provided to Customer 78 health + polygenic diabetes +34 traits such as freckles 84 88+ polygenic heart, diabetes, breast cancer 3000+ including many polygenic diseases including heart, diabetes & 35 genes associated with breast cancer
Physician Oversight No PWNHealth PWNHealth Tovana
Personal Clinical Analysis No No No Yes
Analysis, Interpretation by board certified geneticist No No No Yes
Genetic Counseling No Yes, limited Yes, limited $50 for 30-minute session
Updates Yes, episodic depending on test level, may not receive, sometimes have to purchase new test No, one time results only Yes, free for first year then with $99 per year subscription Not at this time, but under consideration
Cost – Initial Purchase $149 upgrade only after DNA test $199 new purchase -combined health plus ancestry $799 introductory price
Upgrade if Already Tested No $49 upgrade if have already tested $120 to upgrade if already tested, plus $99 year subscription after year 1 Not relevant
Requirements None This is an upgrade from an existing Ancestry test Must test with MyHeritage, not a transfer kit

Are You Eligible?

To see if you are one of the customers eligible to purchase the Tovana Genome Report, sign in, here, and then check your personal page under “Additional Features” to see if the Tovana Genome Report is available. If so, click for more information or to order.

FTDNA Tovana order.png

You’ve probably guessed what my family is receiving for Christmas😊. No one else is going to suffer from or die from something preventable if I can help it.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Big Y News and Stats + Sale

I must admit – this past January when FamilyTreeDNA announced the Big Y-700, an upgrade from the Big Y-500 product, I was skeptical. I wondered how much benefit testers would really see – but I was game to purchase a couple upgrades – and I did. Then, when the results came back, I purchased more!

I’m very pleased to announce that I’m no longer skeptical. I’m a believer.

The Big Y-700 has produced amazing results – and now FamilyTreeDNA has decoupled the price of the BAM file in addition to announcing substantial sale prices for their Thanksgiving Sale.

I’m going to discuss sale pricing for products other than the Big Y in a separate article because I’d like to focus on the progress that has been made on the phylogenetic tree (and in my own family history) as a result of the Big Y-700 this year.

Big Y Pricing Structure Change

FamilyTreeDNA recently announced some product structure changes.

The Big Y-700 price has been permanently dropped by $100 by decoupling the BAM file download from the price of the test itself. This accomplishes multiple things:

  • The majority of testers don’t want or need the BAM file, so the price of the test has been dropped by $100 permanently in order to be able to price the Big Y-700 more attractively to encourage more testers. That’s good for all of us!!!
  • For people who ordered the Big Y-700 since November 1, 2019 (when the sale prices began) who do want the BAM file, they can purchase the BAM file separately through the “Add Ons and Upgrades” page, via the “Upgrades” tab for $100 after their test results are returned. There will also be a link on the Big Y-700 results page. The total net price for those testers is exactly the same, but it represents a $100 permanent price drop for everyone else.
  • This BAM file decoupling reduces the initial cost of the Big Y-700 test itself, and everyone still has the option of purchasing the BAM file later, which will make the Big Y-700 test more affordable. Additionally, it allows the tester who wants the BAM file to divide the purchase into two pieces, which will help as well.
  • The current sale price for the Big Y-700 for the tester who has taken NO PREVIOUS Y DNA testing is now just $399, formerly $649. That’s an amazing price drop, about 40%, in the 9 months since the Big Y-700 was introduced!
  • Upgrade pricing is available too, further down in this article.
  • If you order an upgrade from any earlier Big Y to the Big Y-700, you receive an upgraded BAM file because you already paid for the BAM file when you ordered your initial Big Y test.
  • The VCF file is still available for download at no additional cost with any Big Y test.
  • There is no change in the BAM file availability for current customers. Everyone who ordered before November 1, 2019 will be able to download their BAM file as always.

The above changes are permanent, except for the sale price.

2019 has been a Banner Year

I know how successful the Big Y-700 has been for kits and projects that I manage, but how successful has it been overall, in a scientific sense?

I asked FamilyTreeDNA for some stats about the number of SNPs discovered and the number of branches added to the Y phylotree.

Drum roll please…

Year | Branches Added This Year | Total Tree Branches | Variants Added to Tree This Year | Total Variants Added to Tree
2018 | 6,259 | 17,958 | 60,468 | 132,634
2019 | 4,394 | 22,352 | 32,193 | 164,827

The tests completed in 2019 are only representative for 10 months, through October, and not the entire year.
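
The totals in that table are cumulative, which is easy to confirm with a few lines of Python. This is only a sketch for checking the arithmetic. It treats every figure as a whole number (so the 2019 branch total is 22,352, the same figure quoted later in this article), and the tuple layout and function name are my own.

```python
# (year, branches_added, total_branches, variants_added, total_variants)
# Figures are the ones FamilyTreeDNA provided for the table above.
ROWS = [
    (2018, 6259, 17958, 60468, 132634),
    (2019, 4394, 22352, 32193, 164827),
]


def is_cumulative(previous, current):
    """True if the current year's totals equal last year's totals
    plus what was added during the current year."""
    _, _, prev_branches, _, prev_variants = previous
    _, added_branches, total_branches, added_variants, total_variants = current
    return (
        prev_branches + added_branches == total_branches
        and prev_variants + added_variants == total_variants
    )


print(is_cumulative(ROWS[0], ROWS[1]))
```

In other words, the 2018 totals plus the 2019 additions should reproduce the 2019 totals for both branches and variants.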

Haplotree Branches

Not every SNP discovered results in a new branch being added to the haplotree, but many do. This chart shows the number of actual branches added in 2018 and 2019 to date.

Big Y 700 haplotree branches.png

These stats, provided by FamilyTreeDNA, show the totals in the bottom row, which is a cumulative branch number total, not a monthly total. At the end of October 2019, the total number of individual branches was 22,352.

Big Y 700 haplotree branches small.png

This chart, above, shows some of the smaller haplogroups.

Big Y 700 haplotree branches large.png

This chart shows the larger haplogroups, including massive haplogroup R.

Haplotree Variants

The number of variants listed below is the number of SNPs that have been discovered, named and placed on the tree. You’ll notice that these numbers are a lot larger than the number of branches, above. That’s because roughly 168,000 of these are equivalent SNPs, meaning they don’t further branch the tree – at least not yet. These 168K variants are the candidates to be new branches as more people test and the tree can be further split.

Big Y 700 variants.png

These numbers also don’t include Private Variants, meaning SNPs that have not yet been named.

If you see Private Variants listed in your Big Y results, when enough people have tested positive for the same variant, and it makes sense, the variants will be given a SNP name and placed on the tree.

Big Y 700 variants small.png

The smaller haplogroups variants again, above, followed by the larger, below.

Big Y 700 variants large.png

Upgrades from the Big Y, or Big Y-500 to Big Y-700

Based on what I see in projects, roughly one third of the Big Y and Big Y-500 tests have upgraded to the Big Y-700.

For my Estes line, I wondered how much value the Big Y-700 upgrade would convey, if any, but I’m extremely glad I upgraded several kits. As a result of the Big Y-700, we’ve further divided the sons of Abraham, born in 1647. This granularity wasn’t accomplished by STR testing and wasn’t accomplished by the Big Y or Big Y-500 testing alone – although all of these together are building blocks. I’m ECSTATIC since it’s my own ancestral line that has the new lineage defining SNP.

Big Y 700 Estes.png

Every Estes man descended from Robert born in 1555 has R-BY482.

The sons of the immigrant, Abraham, through his father, Silvester, all have BY490, but the descendants of Silvester’s brother, Robert, do not.

Moses, son of Abraham, has ZS3700, but the rest of Abraham’s sons don’t.

Then, someplace in the line of kit 831469, between Moses born in 1711 and the present-day tester, we find a new SNP, BY154784.

Big Y 700 Estes block tree.png

Looking at the block tree, we see the various SNPs that are entirely Estes, except for one gentleman who does not carry the Estes surname. I wrote about the Block Tree, here.

Without Big Y testing, none of these SNPs would have been found, meaning we could never have split these lines genealogically.

Every kit I’ve reviewed carries SNPs that the Big Y-700 has been able to discern that weren’t discovered previously.

Every. Single. One.

Now, even someone who hasn’t tested Y DNA before can get the whole enchilada – meaning 700+ STRs, testing for all previously discovered SNPs, and new branch defining SNPs, like my Estes men – for $399.

If a new Estes tester takes this test, without knowing anything about his genealogy, I can tell him a great deal about where to look for his lineage in the Estes tree.
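
To show the idea, here's a simplified Python sketch of how a new tester could be placed on the Estes line using the SNPs named above. It assumes the four SNPs form a single straight path, which is a simplification since the real tree branches, and the list, descriptions and function name are hypothetical illustrations rather than anything from FamilyTreeDNA's tools.

```python
# SNPs from the Estes example above, ordered from oldest to most recent.
# Treating them as one straight path is a simplifying assumption.
ESTES_PATH = [
    ("BY482", "descends from Robert Estes, born in 1555"),
    ("BY490", "belongs to Silvester's line, which includes the sons of Abraham"),
    ("ZS3700", "descends from Moses, son of Abraham"),
    ("BY154784", "shares the newest SNP found in the line of kit 831469"),
]


def deepest_branch(positive_snps):
    """Walk down the path and return the last SNP the tester is positive for,
    stopping at the first SNP he does not carry."""
    deepest = None
    for snp, meaning in ESTES_PATH:
        if snp not in positive_snps:
            break
        deepest = (snp, meaning)
    return deepest


# A hypothetical tester who is positive for the first three SNPs only.
print(deepest_branch({"BY482", "BY490", "ZS3700"}))
```

A tester who stops at BY490, for example, would know to look among the descendants of Silvester but not specifically in the line of Moses.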

Reduced Prices

FamilyTreeDNA has made purchasing the Big Y-700 outright, or upgrading, EXTREMELY attractive.

Test | Price
Big Y-700 purchase with no previous Y DNA test | $399
Y-12 upgrade to Big Y-700 | $359
Y-25 upgrade to Big Y-700 | $349
Y-37 upgrade to Big Y-700 | $319
Y-67 upgrade to Big Y-700 | $259
Y-111 upgrade to Big Y-700 | $229
Big Y or Big Y-500 upgrade to Big Y-700 | $189

Note that the upgrades include all of the STR markers as yet untested. For example, the 12-marker to Big Y-700 includes all of the STRs between 25 and 111, in addition to the Big Y-700 itself. The Big Y-700 includes:

  • All of the already discovered SNPs, called Named Variants, extending your haplogroup all the way to the leaf at the end of your branch
  • Personal and previously undiscovered SNPs called Private Variants
  • All of the untested STR markers inclusive through 111 markers
  • A minimum of a total of 700 STR markers, including markers above 111 that are only available through Big Y-700 testing

With the refinements in the Big Y test over the past few years, and months, the Big Y is increasingly important to genealogy – equally or more so than traditional STR testing. In part, because SNPs are not prone to back mutations, and are therefore more stable than STR markers. Taken together, STRs and SNPs are extremely informative, helping to break down ancestral brick walls for people whose genealogy may not reach far back in time – and even those who do.

If you are a male and have not Y DNA tested, there’s never been a better opportunity. If you are a female, find a male on a brick wall line and sponsor a scholarship.

Click here to order or upgrade!

______________________________________________________________


Big Y-500 STR Matching

Family Tree DNA recently introduced Big Y-500 STR matching for men who have taken the Big Y-500 test. This is in addition to the SNP results and matching. If you’d like an introduction or definition of the terms STR and SNP, you can read about SNPs and STRs here.

Beginning in April 2018, Family Tree DNA included an additional 389+ STR markers for Big Y testers as a free bonus, including all earlier testers.

While the Big Y-500 STR marker values have been included in customers’ results for several months, unless you contacted your matches directly, you didn’t know how many of those additional markers above 111 you matched on – until now.

If you haven’t taken the Big Y test, the article Why the Big Y Test? will explain why you might want to. In addition to the Big Y results, which refine your haplogroup and scan the entire gold standard region of the Y chromosome looking for SNPs, you’ll also receive at least 389 Y STR markers above the 111 STR panel for a total of at least 500, for free – which is why the name of the Big Y test was changed to the Big Y-500. If you haven’t tested at the 111 marker level, don’t worry about that because the cost of the upgrade is bundled in the price of the Big Y-500 test. Click here to sign in to your account and then click on the blue upgrade button to view pricing.

Big Y-500 STR Matching

To view your matches and values above the traditional 111 markers, sign on to your account and click on Y DNA matches.

You’ll see the following display.

Y500 matches

The column “Big Y-500 STR Differences” is new. If you have not taken the Big Y-500 test, you won’t see this column.

If you have taken the Big Y-500, you’ll see results for any other man that you match who has taken the Big Y-500 test. In this example, 5 of this person’s matches have also taken the Big Y-500 test.

What Are Big Y-500 STR Differences?

The “Big Y-500 STR Differences” column values are expressed in the format “4 of 441” or something similar.

The first number represents the number of non-matching locations you have above 111 markers – in this case, 4. In the csv download file, this value is displayed in the “Big Y-500 Differences” column.

The second number represents the total number of markers above 111 that have a value for both of you – in this case, 441. In other words, you and the other man are being compared on 441 marker locations. In the csv download file, this value is displayed in the “Big Y-500 Compared” column.
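
If you work with these values outside of the website, the “4 of 441” format is easy to split into its two numbers. Here's a small Python sketch. The function name is my own, and I'm assuming the text always follows the “N of M” pattern described above.

```python
import re


def parse_str_differences(text):
    """Split a value like '4 of 441' into (differences, markers_compared)."""
    match = re.fullmatch(r"\s*(\d+)\s+of\s+(\d+)\s*", text)
    if match is None:
        raise ValueError(f"Unexpected format: {text!r}")
    differences = int(match.group(1))
    compared = int(match.group(2))
    return differences, compared


differences, compared = parse_str_differences("4 of 441")
print(differences, compared)
```

The first number is the count of non-matching markers above 111 and the second is how many markers above 111 had a value for both men.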

Because the markers above 111 are processed using NGS (next generation sequencing) scan technology, virtually every kit will have some marker locations that have no-calls, meaning the test doesn’t read reliably at that location in spite of being scanned several times.

It’s more difficult to read STRs accurately using NGS scan technology, as compared to SNPs. SNPs are only one position in length, so only one position needs to be read correctly. STRs are repeats of a sequence of nucleotides. A 20 repeat sequence could consist of 20 copies of a series of 4 nucleotides, so a total of 80 positions in a row would need to be successfully read several times.

Let’s take a look at how matching works.

How Does Big Y-500 STR Matching Work?

If you have a total of 441 markers that read reliably, but your match has a total of 439 that produced results, the maximum number of markers possible to share would be 439. If you both have no calls on different marker locations, you would match on fewer than 439 locations. Here’s an example just using 9 fictitious markers.

Y500 match example

Based on the example above, we can see that the red cells can’t match because they experienced no-calls, and the yellow cells do have results, but don’t match.

Y500 summary
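
The comparison logic described above can be sketched in a few lines of Python. This is my own illustration, not FamilyTreeDNA's code. The marker names M1 through M9 and their values are fictitious, and a no-call is represented as None.

```python
from typing import Dict, Optional, Tuple


def compare_markers(
    a: Dict[str, Optional[int]], b: Dict[str, Optional[int]]
) -> Tuple[int, int]:
    """Return (differences, markers_compared) for two sets of STR results.
    A marker is compared only if both men have a value for it."""
    differences = 0
    compared = 0
    for name in a.keys() & b.keys():
        value_a, value_b = a[name], b[name]
        if value_a is None or value_b is None:
            continue  # no-call for at least one man, so it can't be compared
        compared += 1
        if value_a != value_b:
            differences += 1
    return differences, compared


# Nine fictitious markers; None marks a no-call.
tester = {"M1": 12, "M2": None, "M3": 15, "M4": 9, "M5": 10,
          "M6": 22, "M7": None, "M8": 14, "M9": 11}
match = {"M1": 12, "M2": 13, "M3": 16, "M4": 9, "M5": None,
         "M6": 22, "M7": None, "M8": 14, "M9": 12}

differences, compared = compare_markers(tester, match)
print(f"{differences} of {compared}")
```

In this made-up example, three markers drop out because of no-calls, leaving six to compare, and two of those six have different values.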

New Filter

There’s also a new filter option so you can view only matches that have taken the Big Y-500 test.

Y500 filter

Let’s look at some of the questions people have been asking.

Frequently Asked Questions

Question 1: Are the markers above 111 taken into account in the Genetic Distance column?

Answer: No, the values calculated in the genetic distance column are the number of mismatches for the marker level you are viewing using a combination of the step-wise and infinite alleles mutation models. (Stay with me here.)

In our example, we’re viewing the 111 marker level, so the genetic distance tells you the number of mismatches at 111 markers. If we were viewing the 67 marker level, then the genetic distance would be for 67 markers.

The number of mismatches above 111 markers shows separately in the “Big Y-500 STR Differences” column and is calculated using the infinite alleles model, meaning every mutation is counted as one difference. You can read more about genetic distance in the article, Concepts – Genetic Distance.

The good news is that you don’t need to calculate anything, but you may want to understand how the markers are scored and how the genetic distance is calculated. If so, go ahead and read question 2. If not, skip to question 3.

Question 2: What’s the difference between the step-wise model and the infinite alleles model?

Answer: The step-wise model assumes that a multi-step difference on a particular marker, such as a 28 for one man and a 30 for another, is the result of two separate mutation events that happened at different times. It is therefore counted as 2 mutations, 2 steps and a genetic distance of 2.

However, this doesn’t work well with palindromic markers, explained here, where multi-copy markers, such as DYS464, often mutate more than one step at a time.

Counting multiple mathematical differences as only one mutation event is called the infinite alleles model. For example, a dual copy marker that has a value of 15-16 could mutate to 15-18 in one step and would be counted as one mutation event, and one difference and a genetic distance of one using the infinite alleles model. The same event would count as 2 mutation events (steps) and a genetic distance of 2 using the step-wise mutation model. In this article, I explain which markers are calculated using which methodology.

Another good infinite alleles example is when a location loses its DNA at a marker entirely. If the marker value for most men being compared is 10 and is being compared to a person with no DNA at that location, resulting in a null value of 0 (which is not the same as a no-call which means the location couldn’t be read successfully), the mutation event happened in one step, and the difference should be counted as one event, one step and a genetic distance of one, not 10 events, 10 steps and a genetic distance of 10.

To recap, the values of markers 1-111 are calculated by a combination of the step-wise model and the infinite alleles model, depending on the marker number and situation. The differences in markers above 111 are calculated using the infinite alleles model where every mutation or difference equals a distance of one unless a zero (null) is encountered. In that case, the mutation event is considered a one. However, above 111 markers, using NGS technology, most instances where no DNA is encountered results in a no-read, not a null value.
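
Here's a minimal Python sketch of how the two models score the examples above. It is not FamilyTreeDNA's actual scoring code, and it doesn't model which markers use which method. Each marker is written as a tuple of copy values so that multi-copy markers work the same way as single-copy ones, and the function names are my own.

```python
def stepwise_distance(a, b):
    """Step-wise model: every repeat of difference counts as one step."""
    return sum(abs(x - y) for x, y in zip(a, b))


def infinite_alleles_distance(a, b):
    """Infinite alleles model: any difference at the marker counts as one."""
    return 0 if tuple(a) == tuple(b) else 1


# Examples from the text: 28 vs. 30, the dual-copy 15-16 vs. 15-18,
# and a value of 10 vs. a null value of 0.
examples = [((28,), (30,)), ((15, 16), (15, 18)), ((10,), (0,))]

for a, b in examples:
    print(a, b, stepwise_distance(a, b), infinite_alleles_distance(a, b))
```

The null example shows why the choice of model matters. The step-wise model would score it as a distance of 10, while the infinite alleles model scores it as 1.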

Question 3: Has the TIP calculator been updated?

Answer: No, the TIP calculator does not take into account the new markers above 111. The TIP calculator relies upon the combined statistical mutation frequency for each marker and includes haplogroup differences. Therefore, it would be difficult to compensate for different numbers of markers, with various markers missing for each individual above 111 markers. The TIP calculator only utilizes markers 1-111.

Question 4: Do projects display more than 111 markers?

Answer: No, projects don’t display the additional markers, at least not yet. The 111 marker results require scrolling to the right significantly, and 500 markers would require 5 times as much scrolling to compare values. Anyone with an idea how to better accomplish a public project display/comparison should submit their idea to Family Tree DNA.

Question 5: Which markers above 111 are fast versus slow mutating?

Answer: Results for these markers are new and statistical compilations aren’t yet available. However, initial results for surname projects in which several men who share a surname and match have tested indicate that there’s not as much variation in these additional markers as we’ve seen in the previous 111 markers, meaning Family Tree DNA already selected the most informative genealogical markers initially. This suggests that the additional markers may provide additional mutations but probably not five times as many as the initial 111 markers.

Question 6: Why do I have more mutations in the first 111 markers than I do in the 389+ markers above the 111 panel?

Answer: That’s a really good question. You’ve probably noticed in our example that the men have disproportionately more mutations in the first 111 markers than in the markers above 111.

Y500 genetic distance

The trend is clearly for the first 111 markers to mutate more frequently than the 389+ markers above 111. This means that the first 111 markers are generally going to be more genealogically informative than the balance of the 389+ markers. However, and this is a big however, if the line marker mutation that you need to sort out your group of men occurs in the markers above 111, the number of mutations and the percentages don’t mean anything at all. The information that matters is how you can utilize these markers to differentiate men within the line you are working with, and what story those markers tell.

Of course, the markers above 111 are free as part of the Big Y-500 test which is designed to extract as much SNP information as possible. In essence, these STR markers are icing on the cake – a treat we never expected.

Bottom Line

Here’s the bottom line about the Big Y-500 STR markers. You don’t know what you don’t know and these 389+ STR markers come along with the Big Y test as a bonus. If you’re looking for line-marker STR mutations in groups of men, the Big Y-500 is a logical next step after 111 marker testing.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some (but not all) of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Whole Genome Sequencing – Is It Ready for Prime Time?

Dante Labs is offering a whole genome test for $199 this week as an early Black Friday special.

Please note that just as I was getting ready to push the publish button on this article, Veritas Genetics also jumped on the whole genome sequencing bandwagon for $199 for the first 1000 testers on Nov. 19 and 20. In this article, I discuss the Dante Labs test. I have NOT reviewed Veritas, their test or their terms, so the same cautions discussed below apply to them and any other company offering whole genome sequencing. The Veritas link is here.

Update – Veritas provides the VCF file for an additional $99, but does not provide FASTQ or BAM files, per their Tweet to me.

I have no affiliation with either company.

$199 (US) is actually a great price for a whole genome test, but before you click and purchase, there are some things you need to know about whole genome sequencing (WGS) and what it can and can’t do for you. Or maybe better stated, what you’ll have to do with your own results before you can utilize the information for genealogical purposes.

The four questions you need to ask yourself are:

  • Why do you want to consider whole genome testing?
  • What question(s) are you trying to answer?
  • What information do you seek?
  • What is your testing goal?

I’m going to say this once now, and I’ll say it again at the end of the article.

Whole genome sequencing tests are NOT A REPLACEMENT FOR GENEALOGICAL DNA TESTS for mitochondrial, Y or autosomal testing. Whole genome sequencing is not a genealogy magic bullet.

There are both pros and cons of this type of purchase, as with most everything. Whole genome tests are for the most experienced and technically savvy genetic genealogists who understand both working with genetics and this field well, who have already taken the vendors’ genealogy tests and are already in the Y, mitochondrial and autosomal comparison data bases.

If that’s you or you’re interested in medical information, you might want to consider a whole genome test.

Let’s start with some basics.

What Is Whole Genome Sequencing?

Whole Genome Sequencing will sequence most of your genome. Keep in mind that humans are more than 99% identical, so the only portions that you’ll care about either medically or genealogically are the portions that differ or tend to mutate. Comparing regions where you match everyone else tells you exactly nothing at all.

Exome Sequencing – A Subset of Whole Genome

Exome sequencing, a subset of whole genome sequencing, is utilized for medical testing. The Exome is the protein-coding portion of the genome, a small fraction of the whole, where most known medically relevant mutations are found. You can read about the benefits and challenges of exome testing here.

I have had my Exome sequenced twice, once at Helix and once at Genos, now owned by NantOmics. Currently, NantOmics does not have a customer sign-in and has acquired my DNA sequence as part of the absorption of Genos. I’ll be writing about that separately. There is always some level of consumer risk in dealing with a startup.

Helix sequences your Exome (plus) so that you can order a variety of DNA based or personally themed products from their marketplace, although I’m not convinced about the utility or even the legitimacy of some of the available tests, such as the “Wine Explorer.”

On the other hand, the world-class National Geographic Society Genographic Project now utilizes Helix for their testing, as does Spencer Wells’ company, Insitome.

You can also pay to download your Exome sequence data separately for $499.

Autosomal Testing for Genealogy

Both whole genome and Exome testing are autosomal testing, meaning that they test chromosomes 1-22 (as opposed to Y and mitochondrial DNA) but the number of autosomal locations varies vastly between the various types of tests.

The locations selected by the genealogy testing companies are a subset of both the whole genome and the Exome. The different vendors that compare your DNA for genealogy generally utilize between 600,000 and 900,000 chip-specific locations that they have selected as being inclined to mutate – meaning that we can obtain genealogically relevant information from those mutations.

Some vendors (for example, 23andMe and Ancestry) also include some medical SNPs (single nucleotide polymorphisms) on their chips, as both have formed medical research alliances with various companies.

Whole genome and Exome sequencing include these same locations, BUT, the whole genome providers don’t compare the files to other testers nor reduce the files to the locations useful for genealogical comparisons. In other words, they don’t create upload files for you.

The following chart is not to scale, but is meant to convey the concept that the Exome is a subset of the whole genome, and the autosomal vendors’ selected SNPs, although not the same between the companies, are all subsets of the Exome and full genome.

I have not had my whole genome sequenced because I have seen no purpose for doing so, outside of curiosity.

This is NOT to imply that you shouldn’t. However, here are some things to think about.

Whole Genome Sequencing Questions

Coverage – Medical grade coverage is considered to be 30X, meaning an average of 30 scans of every targeted location in your genome. Some will have more and some will have less. This means that your DNA is scanned thirty different times to minimize errors. If a read error happens once or twice, it’s unlikely that the same error will happen several more times. You can read about coverage here and here.

Genomics Education Programme [CC BY 2.0 (https://creativecommons.org/licenses/by/2.

Here’s an example where the read length of Read 1 is 18, and the depth of the location shown in light blue is 4, meaning 4 actual reads were obtained. If the goal was 30X, then this result would be very poor. If the goal was 4X then this location is a high quality result for a 4X read.

In the above example, if the reference value, meaning the value at the light blue location for most people is T, then 4 instances of a T means you don’t have a mutation. On the other hand, if T is not the reference value, then 4 instances of T means that a mutation has occurred in that location.
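To make the idea of depth concrete, here is a small Python sketch that counts how many reads cover one location and compares the bases found there to a reference value. It's a toy under stated assumptions: the read sequences, the start positions and the function names are all made up for illustration, and real pipelines work from aligned BAM files with quality filtering rather than plain strings.

```python
from collections import Counter


def pileup(reads, position):
    """Return the bases that the given reads place at one position.

    reads: list of (start_position, sequence) pairs, 0-based coordinates.
    """
    bases = []
    for start, sequence in reads:
        offset = position - start
        if 0 <= offset < len(sequence):
            bases.append(sequence[offset])
    return bases


def summarize(reads, position, reference_base, target_depth):
    """Report depth at a position and whether the reads differ from reference."""
    bases = pileup(reads, position)
    counts = Counter(bases)
    # The most frequently observed base; None if no read covers the position.
    consensus = counts.most_common(1)[0][0] if counts else None
    return {
        "depth": len(bases),
        "meets_target": len(bases) >= target_depth,
        "consensus": consensus,
        "differs_from_reference": consensus is not None
        and consensus != reference_base,
    }


# Five made-up reads; four of them cover position 4, each showing a T.
reads = [
    (0, "ACGTTGCA"),
    (2, "GTTGCATT"),
    (3, "TTGCA"),
    (4, "TGCATTAG"),
    (10, "AGGC"),
]
print(summarize(reads, 4, reference_base="T", target_depth=30))
print(summarize(reads, 4, reference_base="C", target_depth=4))
```

The first call mirrors the 30X case, where a depth of 4 falls short of the goal and the T matches the reference. The second mirrors the 4X case, where the depth is adequate and the T differs from the reference, suggesting a mutation.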

Dante Labs coverage information is provided from their webpage as follows:

Other vendors’ coverage values will differ, but you should always know what you are purchasing.

Ownership – Who owns your data? What happens to your DNA itself (the sample) and results (the files) under normal circumstances and if the company is sold? Typically, the assets of the company, meaning your information, are included during any acquisition.

Does the company “share, lease or sell” your information as an additional revenue stream with other entities? If so, do they ask your permission each and every time? Do they perform internal medical research and then sell the results? What, if anything, is your DNA going to be used for other than the purpose for which you purchased the test? What control do you exercise over that usage?

Read the terms and conditions carefully for every vendor before purchasing.

File Delivery – Three types of files are generated during a whole genome test.

The VCF (Variant Call Format) which details your locations that are different from the reference file. A reference file is the “normal” value for humans.

A FASTQ file which includes the nucleotide sequence along with a corresponding quality score. Mutations in a messy area or that are not consistent may not be “real” and are considered false positives.

The BAM (Binary Alignment Map) file is used for Y DNA SNP alignment. The output from a BAM file is displayed in Family Tree DNA’s Big Y browser for their customers. Are these files delivered to you? If so, how? Family Tree DNA delivers their Big Y DNA BAM files as free downloads.
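To show what a “corresponding quality score” looks like in practice, here is a minimal Python sketch that reads one four-line FASTQ record and flags low-quality positions. It assumes the common Phred+33 encoding, where each quality character's ASCII code minus 33 is the score. The record itself, the function names and the cutoff of 20 are illustrative assumptions of mine, not anything Dante or another vendor specifies.

```python
def parse_fastq_record(lines):
    """Parse one FASTQ record: header, sequence, separator, quality string."""
    header, sequence, separator, quality = [line.strip() for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Phred+33 (assumed): score = ASCII code of the character minus 33.
    scores = [ord(character) - 33 for character in quality]
    return header[1:], sequence, scores


def low_quality_positions(scores, minimum_score=20):
    """Return the 0-based positions whose score falls below the cutoff."""
    return [i for i, score in enumerate(scores) if score < minimum_score]


# A made-up record: six confident base calls and one very doubtful one.
record = ["@read1", "GATTACA", "+", "IIII#II"]
name, sequence, scores = parse_fastq_record(record)
print(name, sequence, scores, low_quality_positions(scores))
```

A base with a low score is the kind of read that may not be “real,” which is why a mutation supported only by such reads is treated with suspicion.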

Typically whole genome data is too large for a download, so it is sent on a disc drive to you. Dante provides this disc for BAM and FASTQ files for 59 Euro ($69 US) plus shipping. VCF files are available free, but if you’re going to order this product, it would be a shame not to receive everything available.

Version – Discoveries are still being made to the human genome. If you thought we’re all done with that, we’re not. As new regions are mapped successfully, the addresses for the rest change, and a new genomic map is created. Think of this as street addresses and a new cluster of houses is now inserted between existing houses. All of the houses are periodically renumbered.

Today, results are typically delivered in either of two versions: hg19 (GRCh37) or hg38 (GRCh38). What happens when the next hg (human genome) version is released?

When you test with a vendor who uses your data for comparison as a part of a product they offer, they must realign your data so that the comparison will work for all of their customers (think Family Tree DNA and GedMatch, for example), but a vendor who only offers the testing service has no motivation to realign your output file for you. You only pay for sequencing, not for any after-the-fact services.
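The house renumbering analogy can be sketched in a few lines of Python. This is a toy with made-up numbers and a hypothetical function name. Real conversions between genome versions rely on dedicated mapping tools and alignment data, not a single offset, so treat this purely as an illustration of why old addresses stop working.

```python
def shift_coordinate(position, insert_at, insert_length):
    """Return the new address of a location after a block is inserted.

    Locations before the insertion point keep their address; everything at
    or after it moves up by the length of the newly mapped region.
    """
    if position >= insert_at:
        return position + insert_length
    return position


# A made-up SNP list, renumbered after 20 new "houses" appear at address 150.
old_positions = [100, 150, 200]
new_positions = [shift_coordinate(p, insert_at=150, insert_length=20)
                 for p in old_positions]
print(new_positions)
```

A file that still uses the old addresses would point at the wrong locations for everything past the insertion, which is why realignment matters for comparisons.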

Platform – Multiple sequencing platforms are available, and not all platforms are entirely compatible with other competing platforms. For example, the Illumina platform and chips may or may not be compatible with the Affymetrix platform (now Thermo Fisher) and chips. Ask about chip compatibility if you have a specific usage in mind before you purchase.

Location – Where is your DNA actually being sequenced? Are you comfortable having your DNA sent to that geographic location for processing? I’m personally fine with anyplace in either the US, Canada or most of Europe, but other locations maybe not so much. I’d have to evaluate the privacy policies, applicable laws, non-citizen recourse and track record of those countries.

Last but perhaps most important, what do you want to DO with this file/information?

Utilization

What you receive from whole genome sequencing is files. What are you going to do with those files? How can you use them? What is your purpose or goal? How technically skilled are you, and how well do you understand what needs to be done to utilize those files?

A Specific Medical Question

If you have a particular question about a specific medical location, Dante allows you to ask the question as soon as you purchase, but you must know what question to ask as they note below.

You can click on their link to view their report on genetic diseases, but keep in mind, this is the disease you specifically ask about. You will very likely NOT be able to interpret this report without a genetic counselor or physician specializing in this field.

Take a look at both sample reports, here.

Health and Wellness in General

The Dante Labs Health and Wellness Report appears to be a collaborative effort with Sequencing.com and also appears to be included in the purchase price.

I uploaded both my Exome and my autosomal DNA results from the various testing companies (23andMe V3 and V4, Ancestry V1 and V2, Family Tree DNA, LivingDNA, DNA.Land) to Promethease for evaluation and there was very little difference between the health-related information returned based on my Exome data and the autosomal testing vendors. The difference is, of course, that the Exome coverage is much deeper (and therefore more reliable) because that test is a medical test, not a consumer genealogy test and more locations are covered. Whole genome testing would be more complete.

I wrote about Promethease here and here. Promethease does accept VCF files from various vendors who provide whole genome testing.

None of these tests are designed or meant for medical interpretation by non-professionals.

Medical Testing

If you plan to test on the theory that you’ll already be ahead of the curve should your physician need a genetics test, don’t be so sure. It’s likely that your physician will want a genetics test using the latest technology, from their own lab, where they understand the quality measures in place as well as how the data is presented to them. They are unlikely to accept a test from any other source. I know, because I’ve already had this experience.

Genealogical Comparisons

The power of DNA testing for genealogy is comparing your data to others. Testing in isolation is not useful.

Mitochondrial DNA – I can’t tell for sure based on the sample reports, but it appears that you receive your full sequence haplogroup and probably your mutations as well from Dante. They don’t say which version of mitochondrial DNA they utilize.

However, without the ability to compare to other testers in a database, what genealogical benefit can you derive from this information?

Furthermore, mitochondrial DNA also has “versions,” and converting from an older to a newer version is anything but trivial. Haplogroups are renamed and branches sawed from one part of the mitochondrial haplotree and grafted onto another. A testing (only) vendor that does not provide comparisons has absolutely no reason to update your results and can’t be expected to do so. V17 is the current build, released in February 2016, with the earlier version history here.

Family Tree DNA is the only vendor who tests your full sequence mitochondrial DNA, compares it to other testers and updates your results when a new version is released. You can read more about this process, here and how to work with mtDNA results here.

Y DNA – Dante Labs provides BAM files, but other whole genome sequencers may not. Check before you purchase if you are interested in Y DNA. Again, you’ll need to be able to analyze the results and submit them for comparison. If you are not capable of doing that, you’ll need to pay a third party like either YFull or FGC (Full Genomes Corporation) or take the Big Y test at Family Tree DNA, which has the largest Y database worldwide and compares results.

Typically whole genome testers are looking for Y DNA SNPs, not STR values in BAM files. STR (short tandem repeat) values are the results that you receive when you purchase the 37, 67 or 111 tests at Family Tree DNA, as compared to the Big Y test which provides you with SNPs in order to resolve your haplogroup at the most granular level possible. You can read about the difference between SNPs and STRs here.

As with SNP data, you’ll need outside assistance to extract your STR information from the whole genome sequence information, none of which can be compared with the testers in the Family Tree DNA database. There is also an issue of copy-count standardization between vendors.

You can read about how to work with STR results and matches here and Big Y results here.

Autosomal DNA – None of the major providers that accept transfers (MyHeritage, Family Tree DNA, GedMatch) accept whole genome files. You would need to find a methodology of reducing the files from the whole genome to the autosomal SNPs accepted by the various vendors. If the vendors adopt the digital signature technology recently proposed in this paper by Yaniv Erlich et al to prevent “spoofed files,” modified files won’t be accepted by vendors.
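To give a feel for what “reducing the files” involves, here is a minimal Python sketch that keeps only the VCF records falling on a list of target locations. Everything specific in it is an assumption of mine for illustration: the positions, the target list and the function name are made up, and no vendor publishes this as their method. It also shows the catch. A VCF lists only locations that differ from the reference, so targets that don't appear are reported as missing rather than guessed at, and a real conversion would still need the vendor's file format and the matching genome version.

```python
def reduce_vcf_to_targets(vcf_lines, targets):
    """Keep VCF records whose (chromosome, position) is in the target set.

    vcf_lines: lines of a VCF file, header lines included.
    targets: hypothetical set of (chromosome, position) pairs a vendor accepts.
    Returns the kept records and the targets not found in the file.
    """
    kept = []
    found = set()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        chromosome, position, _identifier, reference, alternate = fields[:5]
        key = (chromosome, int(position))
        if key in targets:
            kept.append((chromosome, int(position), reference, alternate))
            found.add(key)
    return kept, targets - found


# Made-up example data, not real SNP locations.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "1\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1",
    "1\t2000\t.\tC\tT\t50\tPASS\t.\tGT\t1/1",
    "2\t1500\t.\tG\tA\t50\tPASS\t.\tGT\t0/1",
]
wanted = {("1", 2000), ("2", 1500), ("3", 42)}
kept, missing = reduce_vcf_to_targets(example_vcf, wanted)
print(kept)
print(missing)
```

Whether a missing target matches the reference or simply wasn't read can't be determined from the VCF alone, which is part of why this reduction is not a simple do-it-yourself step.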

Summary

Whole genome testing, in general, will and won’t provide you with the following:

Desired Feature | Whole Genome Testing
Mitochondrial DNA | Presumed full haplogroup and mutations provided, but no ability for comparison to other testers. Upload to Family Tree DNA, the only vendor doing comparisons, is not available.
Y DNA | Presume Y chromosome mostly covered, but limited ability for comparison to other testers for either SNPs or STRs. Must utilize either YFull or FGC for SNP/STR analysis. Upload to Family Tree DNA, the vendor with the largest database, is not available when testing elsewhere.
Autosomal DNA for genealogy | Presume all SNPs covered, but file output needs to be reduced to SNPs offered/processed by vendors accepting transfers (Family Tree DNA, MyHeritage, GedMatch) and converted to their file formats. Modified files may not be accepted in the future.
Medical (consumer interest) | Accuracy is a factor of targeted coverage rate and depth of actual reads. Whole genome vendors may or may not provide any analysis or reports. Dante does, but for a limited number of conditions. Promethease accepts VCF files from vendors and provides more.
Medical (physician accepted) | Physician is likely to order a medical genetics test through their own institution. Physicians may not be willing to risk a misdiagnosis due to a factor outside of their control such as an incompatible human genome version.
Files | VCF, FASTQ and BAM may or may not be included with results, and may or may not be free.
Coverage | Coverage and depth may or may not be adequate. Multiple extractions (from multiple samples) may or may not be included with the initial purchase (if needed) or may be limited. Ask.
Updates | Vendors who offer sequencing as part of products that include comparison to other testers will update your results version to the current reference version, such as hg38 and mitochondrial V17. Others do not, nor can they be expected to provide that service.
Version | Inquire as to the human genome (hg) version or versions available to you, and which version(s) are acceptable to the third party vendors you wish to utilize. When the next version of the human genome is released, your file will no longer be compatible because WGS vendors are offering sequencing only, not results comparisons to databases for genealogy.
Ownership/Usage | Who owns your sample? What will it be utilized for, other than the service you ordered, by whom and for what purposes? Will you be able to authorize or decline each usage?
Location | Where geographically is your DNA actually being sequenced and stored? What happens to your actual DNA sample itself and the resulting files? This may not be the location where you return your swab kit.

The Question – Will I Order?

The bottom line is that if you are a genealogist, seeking genetic information for genealogical purposes, you’re much better off testing with the standard and well-known genealogy vendors who offer compatibility and comparisons to other testers.

If you are a pioneer in this field, have the technical ability required to make use of a whole genome test and are willing to push the envelope, then perhaps whole genome sequencing is for you.

I am considering ordering the Dante Labs whole genome test out of simple curiosity and to upload to Promethease to determine if the whole genome test provides me with something potentially medically relevant (positive or negative) that autosomal and Exome testing did not.

I’m truly undecided. Somehow, I’m having trouble parting with the $199 plus $69 (hard drive delivery by request when ordering) plus shipping for this limited functionality. If I were a novice genetic genealogist or were not a technology expert, I would definitely NOT order this test for the reasons mentioned above.

A whole genome test is not in any way a genealogical replacement for a full sequence mitochondrial test, a Y STR test, a Y SNP test or an autosomal test along with respective comparison(s) in the data bases of vendors who don’t allow uploads for these various functions.

The simple fact that 30X whole genome testing is available for $199 plus $69 plus shipping is amazing, given that 15 years ago that same test cost 2.7 billion dollars. However, it’s still not the magic bullet for genealogy – at least, not yet.

Today, the necessary integration simply doesn’t exist. You pay the genealogy vendors not just for the basic sequencing, but for the additional matching and maintenance of their data bases, not to mention the upgrading of your sequence as needed over time.

If I had to choose between spending the money for the WGS test or taking the genealogy tests, hands down, I’d take the genealogy tests because of the comparisons available. Comparison and collaboration are absolutely crucial for genealogy. A raw data file buys me nothing genealogically.

If I had not previously taken an Exome test, I would order this test in order to obtain the free Dante Health and Wellness Report which provides limited reporting and to upload my raw data file to Promethease. The price is certainly right.

However, keep in mind that once you view health information, you cannot un-see it, so be sure you do really want to know.

What do you plan to do? Are you going to order a whole genome test?


Family Tree DNA’s PUBLIC Y DNA Haplotree

It’s well known that, as a result of Big Y testing, Family Tree DNA has amassed a huge library of Y DNA full sequence results that have revealed new SNPs, meaning new haplotree branches, for testers. That’s how the Y haplotree is built. I wrote about this in the article, Family Tree DNA Names 100,000 New Y DNA SNPs.

Up until now, the tree was only available on each tester’s personal pages, but that’s not the case anymore.

Share the Wealth

Today, Family Tree DNA has made the tree public. Thank you, thank you, THANK YOU Family Tree DNA.

To access the tree, click here, but DON’T sign in. Scroll to the bottom of the page. Keep scrolling, and scrolling…until you see the link under Community that says “Y-DNA Haplotree.” Click there.

The New Public Haplotree

The new public haplotree is amazing.

This tree isn’t just for people who took the Big Y test, but includes anyone who has a haplogroup confirming SNP OR took the Big Y test. Predicted haplogroups, of course, aren’t included.

Each branch includes the location of the most distant known ancestor of individuals who carry that terminal SNP, shown with a flag.

The branches are color coded by the following:

  • Light blue = haplogroup root branches
  • Teal or blue/green = branches with no descendants
  • Dark blue = branches that aren’t roots and that do have at least one descendant branch

The flag location is determined by the most distant known ancestor, so if you don’t have a “Most Distant Known Ancestor” completed, with a location, please, please, complete that field by clicking on “Manage Personal Information” beneath your profile picture on your personal page, then on Genealogy, shown below. Be sure to click on Save when you’re finished!

View Haplotree By

Viewing the haplotree is not the same as searching. “View by” is how the tree is displayed.

Click on the “View By” link to display the options: country (flags), which is the default, surname or variant.

Surname view is shown below.

The third view is variant view. By the way, a variant is another word for SNP. For haplogroup R-M207, there are 8,202 variants, meaning SNPs, or branches, occurring beneath it.

Reports

On any of the branch links, you’ll see three dots at the far right.

To view reports by country or surname, click on the dots to view the menu, then click on the option you desire.

Country statistics above, surname below. How cool is this!

Searching

The search function is dependent on the view currently selected. If you are in the surname view, then the search function says “Search by Surname” which allows you to enter a surname. I entered Estes.

If I’m not currently on the haplogroup R link, the system tells me that there are 2 Estes results on R. If I’m on the R link, the system just tells me how many results it found for that surname on this branch and if there are others on other branches.

The tree then displays the direct path between R-M207 (haplogroup R root) and the Estes branch.

…lots of branches in-between…

The great thing about this is that I can now see the surnames directly above my ancestral surname, if they meet the criteria to be displayed.

The display criteria are that two people match on the same branch AND that they both have selected public sharing. Requiring two matching surnames per branch confirms that result.

If you want to look at a specific variant, you can enter that variant name (BY490) in the search box and see the surnames associated with the variant. Then click on “View by” to change the view from country (maps) to surnames to variants.

Change from country to surname.

And from surname to variants.

What geeky fun!!!

Go to Branch Name

If you want to research a specific branch, you can go there directly by utilizing the “Go to Branch Name” function, but you must enter the haplogroup in front of the branch name. R-BY490 for example.

When you’re finished with this search, REMOVE THE BRANCH NAME from the search box, if you’re going to do any other searches, or the system thinks you’re searching within that branch name.

My Result Isn’t Showing

In order for your results to be included on the tree, you must have fulfilled all 3 of these criteria:

  • Taken either a SNP or Big Y test
  • Opted in for public sharing
  • More than one result for that branch with the same exact surname
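The three criteria above boil down to a simple check, sketched here in Python. This is only my reading of the rules as described in this article, not Family Tree DNA's actual logic. The `Tester` fields, the function name and the sample surnames and branch names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tester:
    surname: str
    branch: str
    has_snp_or_big_y: bool  # took either a SNP test or the Big Y test
    public_sharing: bool    # opted in for public sharing


def is_displayed(tester, all_testers):
    """Return True if this tester's result should appear on the public tree."""
    if not (tester.has_snp_or_big_y and tester.public_sharing):
        return False
    # Look for at least one other qualifying result on the same branch
    # with exactly the same surname.
    others = [
        other
        for other in all_testers
        if other is not tester
        and other.branch == tester.branch
        and other.surname == tester.surname
        and other.has_snp_or_big_y
        and other.public_sharing
    ]
    return len(others) >= 1


# Made-up testers: two qualify together, one has no match, one opted out.
first = Tester("Alpha", "R-X1", True, True)
second = Tester("Alpha", "R-X1", True, True)
alone = Tester("Beta", "R-X1", True, True)
private = Tester("Alpha", "R-X1", True, False)
everyone = [first, second, alone, private]
print([is_displayed(t, everyone) for t in everyone])
```

If your result isn't showing, this is the order I'd check things in: your own test, your own sharing setting, then whether a second man with your surname qualifies on the same branch.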

If you think your results should be showing and they aren’t, check your privacy settings by clicking the orange “Manage Personal Information” under your profile picture on your main page, then on the Privacy and Sharing tab.

Still not showing? See if you match another male of the same surname on the Big Y or SNP test at the same level.

If your surname isn’t included, you can recruit testers from that branch of your family.

How Can I Use This?

I’m like a kid with a new toy.

If any of your family surnames are rather unique, search to see if they are on the tree.

Hey look, my Vannoy line is on haplogroup I! Hmmm, clear the schedule, I’m going to be busy all day!

Every haplogroup has a story – and that story belongs to the men, and their families, who carry that haplogroup! I gather the haplogroups for each of my family surnames and this public tree just made this task much, MUCH easier.

Discovering More

If the testers have joined the appropriate surname project, you may also be able to find them in that project to see if they descend from a common line with you. To check and see, click here and then scroll down to the “Search Surname” section of the main Family Tree DNA webpage and enter the surname.

You can see if there is a project for your surname, and if not, your surname may be included in other projects.

Click on any of those links to view the project or contact the (volunteer) project administrators.

Want to search for another surname? The project search box is shown at the right in this view.

What gems can you find?

Want to Test?

If you are a male and you want to take the Big Y test or order a haplogroup confirming SNP, or you are a female who would like to sponsor a test for a male with a surname you’re interested in, you can purchase the Big Y test, here. As a bonus, you will also receive all of the STR markers for genealogical comparison.

Wonder what you can learn? You will be searching for matches to other males with the same surname. You can learn about your history. Confirm your ancestral line. Learn where they came from. You can help the scientific effort and contribute to the tree. For more information, read the article, Working with Y DNA – Your Dad’s Story.

Have fun!!!


Family Tree DNA Names 100,000 New Y DNA SNPs

Recently, Family Tree DNA named 100,000 new SNPs on the Y DNA haplotree, bringing their total to over 153,000. Given that Family Tree DNA does the majority of the Y DNA NGS “full sequence” testing in the industry with their Big Y product, it’s not at all surprising that they have discovered these new SNPs, currently labeled as “Unnamed Variants” on customers’ Big Y Results pages.

The surprising part was twofold:

Family Tree DNA single-handedly propelled science forward with the introduction of the Big Y test. They likely have performed more NGS Y chromosome tests than the entire rest of the world combined. Assuredly, they have commercially.

Originally, in the early 2000s, a new SNP wasn’t named until there were three independent instances of discovery. That pre-NGS “rule” didn’t take into account three men from the same family line because very few men had been tested at that point in time, let alone multiple men from the same family. This type of testing was originally only done in an academic environment. A caveat was put into place by Family Tree DNA when they started discovering SNPs that the 3 individuals had to be from separate family lines and the SNP in question had to be verified by Sanger sequencing before being considered for name assignment and tree placement. At that time, they were pushing the scientific envelope.

In recent years, those criteria changed to two individuals. With this new development, the SNP is being named with one reliable occurrence, BUT, the SNP still is not being placed on the tree without two high quality occurrences.
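The difference between naming a SNP and placing it on the tree can be sketched as a tiny Python function. The function name, the status wording and the idea of passing in two simple counts are my own assumptions for illustration. The actual quality thresholds used by Family Tree DNA aren't described here.

```python
def snp_status(reliable_occurrences, high_quality_occurrences):
    """Classify a newly discovered SNP under the current practice.

    One reliable occurrence is enough for a name; two high quality
    occurrences are needed before the SNP is placed on the tree.
    """
    if high_quality_occurrences >= 2:
        return "named and placed on the tree"
    if reliable_occurrences >= 1:
        return "named, awaiting a second occurrence"
    return "unnamed variant"


print(snp_status(reliable_occurrences=1, high_quality_occurrences=1))
print(snp_status(reliable_occurrences=2, high_quality_occurrences=2))
```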

Naming the SNPs early while awaiting that second occurrence allows discussion about the validity of that particular finding. Family Tree DNA was not the first to move to this practice.

Some time ago, two other firms began analyzing the BAM files produced by Family Tree DNA for an additional analysis fee. Those firms began naming SNPs before three occurrences had been documented, a practice which has been well-accepted by the genetic genealogy community. Everyone seems to be anxious to see their SNP(s) named and placed on the tree, although there is little consensus or standardization about the criteria to place a SNP on the tree or the line between high, medium and low quality SNP read results.

The definition of a new haplogroup, meaning a high quality named SNP, is a new branch in the Y tree. Every new SNP mutation has the potential to be carried for many generations – or to go extinct in one or two.

As the industry has matured, SNP naming procedures have evolved too.

How SNP Names Are Assigned

The lab or entity that discovers a SNP gets to name the SNP. That means that their abbreviation is appended to the beginning of the SNP number, thereby in essence crediting that entity for the discovery. Clearly more conservative namers can’t append their initials to nearly as many SNPs as aggressive namers.

Here’s a list of the naming entities, maintained by ISOGG.

In 2006, the first year that ISOGG compiled a SNP tree, the number of Y DNA haplogroups was 460, including singletons, not tens of thousands. No one would ever have believed this SNP tsunami would happen, let alone in such a short time.

Naming SNPs

Because Family Tree DNA waited to name SNPs until 3 were discovered in unrelated family lines, and required confirmation by Sanger sequencing, the analysis entities were able to “discover” and name those SNPs with their own prefix by implementing less stringent naming criteria. It also increased the possibility of dual naming, a phenomenon that occurs when multiple entities name the same SNP at about the same time.

Some people who maintain trees list all of these equivalent SNPs that were named for the exact same mutation, at the same time. Family Tree DNA does not. If the same SNP is named more than once, Family Tree DNA selects one to name the tree branch – in the example below, ZP58. Checking YBrowse, this SNP was also named FGC11161 and ZP56.2.

However, you can see that SNP ZP58 has several other SNPs keeping it company on the same branch, at least for now.

The FGC SNPs above are only assigned as branch equivalents of ZP58 until a discovery is made that will further divide this branch into two or more branches. That’s how the tree is built.

Sometimes defining a unique SNP is not as straightforward as one would think, especially when utilizing scan technology.

While YFull doesn’t do testing, Full Genomes Corporation does. All of the YFull named SNPs are a result of interpreting BAM files of individuals who have tested elsewhere and naming SNPs that the testing labs didn’t name.

Today, YBrowse, also maintained by ISOGG in conjunction with Thomas Krahn, shows the following three organizations with the highest named SNP totals:

  • Family Tree DNA – BY and L prefixes, (L from before the Big Y test) – 153,902
  • YFull – Y prefix – 133,571 (plus 6447 YP SNPs submitted by citizen scientists for verification)
  • Full Genomes Corporation – FGC prefix – 81,363
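As a rough illustration of how a SNP name credits its naming entity, the sketch below maps the prefixes listed above to their organizations. The lookup table and the `naming_entity` function are my own hypothetical helpers, not an ISOGG or YBrowse tool, and the table covers only the prefixes mentioned in this article.

```python
import re

# Prefixes taken from the list above; any other prefix is reported as unknown.
PREFIX_TO_ENTITY = {
    "BY": "Family Tree DNA",
    "L": "Family Tree DNA",
    "Y": "YFull",
    "YP": "YFull (citizen scientist submission)",
    "FGC": "Full Genomes Corporation",
}


def naming_entity(snp_name):
    """Return the naming entity implied by the letters that start a SNP name."""
    match = re.match(r"[A-Za-z]+", snp_name)
    prefix = match.group(0).upper() if match else ""
    return PREFIX_TO_ENTITY.get(prefix, "unknown")


for name in ["BY490", "Y93760", "FGC11161", "ZP58"]:
    print(name, "->", naming_entity(name))
```

Prefixes that are not in the list above, such as ZP, simply come back as unknown here; the full list of naming entities is the one maintained by ISOGG.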

Just because a SNP is named doesn’t mean that it has been placed on the haplotree. Today, Family Tree DNA has just over 14,100 branches on their tree, with a total of 102,104 SNPs (from all naming sources) placed on their tree. That number increases daily as the following placement criteria are met:

  • Read quality confirmed by the lab
  • Two or more instances of the SNP
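The difference between naming and placement can be written as a tiny decision rule. This is only a sketch of the criteria as described in this article (one reliable occurrence to name, two plus lab-confirmed read quality to place); the function name and its inputs are hypothetical, and the real vetting involves individual review that a two-line rule can't capture.

```python
def snp_status(high_quality_occurrences, lab_confirmed_quality):
    """Classify a variant using the naming and placement criteria described above."""
    if high_quality_occurrences >= 2 and lab_confirmed_quality:
        return "placed on tree"
    if high_quality_occurrences >= 1:
        return "named, awaiting placement"
    return "unnamed"


print(snp_status(1, True))   # a single reliable occurrence
print(snp_status(2, True))   # second occurrence found and quality confirmed
```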

SNPs Applied to Family History

All SNPs discovered through the Big Y process and named by Family Tree DNA begin with BY, so my Estes lineage is BY490. This mutation (SNP) occurred after the birth of Robert Eastye in 1555, because a descendant of one of his sons carries only BY482 while the descendants of another son carry BY490.

In the pedigree above, kit 166011, to the far right is BY482 and the rest are all BY490, which is one mutation below BY482 on the haplotree.

This means, of course, that the mutation BY490 occurred someplace between the common ancestor of all of these men, Robert Eastye born in 1555, and Abraham Estes born in 1647. All of Abraham’s descendants carry BY490 along with BY482, but kit 166011 carries only BY482. Therefore, we know within two generations when BY490 occurred. Furthermore, if someone descended from one of Abraham’s brothers (Robert, Silvester, Thomas, Richard, Nicholas or John), represented on this chart by Richard, were to test, we could tell from that result if the mutation occurred between Robert and Silvester, or between Silvester and Abraham.
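The reasoning above can be sketched as a small narrowing routine. Everything here is a hypothetical illustration: the function name is mine, the lineage is just the three generations named in this paragraph, and each tester is reduced to the most recent man on that lineage he descends from plus whether he carries BY490. It also assumes the mutation arose only once in the family.

```python
def candidate_generations(lineage, testers):
    """Narrow down which man on a father-to-son lineage first carried a SNP.

    lineage: names ordered from oldest to most recent generation.
    testers: (branch_point, carries_snp) pairs, where branch_point is the most
             recent man on the lineage that the tester descends from.
    """
    low, high = 0, len(lineage) - 1
    for branch_point, carries_snp in testers:
        i = lineage.index(branch_point)
        if carries_snp:
            # The branch point himself must already have carried the SNP.
            high = min(high, i)
        else:
            # The branch point lacked the SNP, so it arose in a later generation.
            low = max(low, i + 1)
    return lineage[low:high + 1]


lineage = ["Robert", "Silvester", "Abraham"]
known = [("Robert", False), ("Abraham", True)]  # kit 166011, Abraham's descendants

print(candidate_generations(lineage, known))
print(candidate_generations(lineage, known + [("Silvester", True)]))
print(candidate_generations(lineage, known + [("Silvester", False)]))
```

With only today's testers, two generations remain as candidates. A tester from one of Abraham's brothers' lines would reduce that to one: Silvester if he carries BY490, Abraham if he doesn't.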

Unnamed Variants Versus Named SNPs

As it turns out, reserving a location for the Unnamed Variants in the SNP tree is much like making a dinner reservation. It’s yours to claim, assuming everyone shows up.

In the case of Unnamed Variants, Family Tree DNA reserved the SNP name and the SNP will be placed on the tree as soon as a second occurrence is discovered and the SNP is entirely vetted for quality and accuracy. Palindromic and high repeat regions were excluded unless manually verified.

While this article isn’t going to delve into how to determine read quality, every SNP placed on the tree at Family Tree DNA is individually evaluated to assure that they are not being placed erroneously or that a “mutation” isn’t really a misalignment or read issue.

Currently, Family Tree DNA is working their way through the entire haplotree, placing SNPs in the correct location. As you can see, they have more than 100,000 to go and more SNPs are discovered every day.

In the case of the Estes men, you can see their branch placement in the much larger tree.

As we learn more, sometimes branch placements move.

Is Your Unnamed Variant on the List?

ISOGG maintains an index of BY SNPs. BY of course equates to Big Y.

Before using the index, you first need to sign on to your Family Tree DNA account and look at your Unnamed Variants on your Big Y personal page.

If you don’t have any Unnamed Variants, that means all of your Unnamed Variants have already been named. Congratulations!

If you do have Unnamed Variants, click on the position number to take a look on the browser.

This unnamed variant result is clearly a valid read, with almost every forward and reverse read showing the same mutation, all high-quality reads and no “messy” areas nearby that might suggest an alignment issue. You can read more about how to work with your Big Y results in the article, Working With the New Big Y Results (hg38).

Next, go to the ISOGG BY Index page and enter the position number of the variant in the search box – in this case, 13311600.

In this case, 13311600 is not included in the BY Index because YFull already beat Family Tree DNA to the punch and named this SNP.

How do I know that? Because after seeing that there was no result for 13311600 on the ISOGG page, I checked YBrowse.

You can utilize YBrowse to see if an Unnamed Variant has previously been named. You can see the SNP name, Y93760, directly above the left side of the red bar below. The “Y” of course tells you that YFull was the naming entity.

YBrowse is more fussy and complex to use than doing the simple ISOGG search. You only need to utilize YBrowse if your Unnamed Variant isn’t listed in the BY ISOGG search tool.

To use YBrowse successfully, you must enter the search in the format of “chrY:13311600..13311600” without the quotation marks, where the number is the variant location entered twice, and then click search.
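If you have several Unnamed Variants to look up, it's easy to mistype that search string. Here is a minimal sketch that builds it from a position number; `ybrowse_region` is a hypothetical helper of my own, and it only formats the text you paste into the YBrowse search box.

```python
def ybrowse_region(position):
    """Format a single Y chromosome position as a YBrowse search string."""
    position = int(position)
    if position <= 0:
        raise ValueError("position must be a positive integer")
    # The same position is used for both the start and the end of the region.
    return f"chrY:{position}..{position}"


for pos in [13311600, 14070341]:
    print(ybrowse_region(pos))
```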

The next Unnamed Variant, 14070341, is included in the ISOGG search list, so no need to utilize YBrowse for this one.

To see the new name that this SNP will be awarded when/if it’s placed on the tree, click on the link “BY SNPs 100K.” You’ll see the page, below.

Then, scroll down or use your browser search to find the variant location.

There we go – this variant will be named BY105782 as soon as Family Tree DNA places it on the tree! I’ll be watching!

Where will it be located on the tree, and will it be the new Estes terminal SNP, meaning the SNP that defines our haplogroup? I can’t wait to find out! It’s so much fun to be a part of scientific discovery.

If you’re a male and haven’t taken the Big Y test, now’s a great time. Click here to order. You can play a role in scientific discovery too. Does your Y DNA carry undiscovered SNPs?

A big thank you to Family Tree DNA for making resources available to answer questions about their new SNPs and naming processes.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Family Tree DNA’s Y-500 is Free for Big Y Customers

Did you notice something new on your Y DNA results page at Family Tree DNA this week? If not quite yet, you will soon if you have taken the Big Y test. There’s a surprise waiting for you. You can sign in here to take a look.

The first thing you might notice is that the Big Y has been renamed to the Big Y500. However, the results I want you to take a look at aren’t under the Big Y500 tab, but on your regular Y DNA Y-STR Results tab. Click to take a look.

In the past, 5 panels of Y DNA STR markers have been available:

  • Panel 1 – 1-12 markers
  • Panel 2 – 13-25 markers
  • Panel 3 – 26-37 markers
  • Panel 4 – 38-67 markers
  • Panel 5 – 68-111 markers

Now, a 6th panel has been added:

  • Panel 6 – 112-550 markers

However, there is a difference between the first 5 panels and the 6th panel.

Why is it Called the Y500?

If there is a total of 550 markers reported, why is this product called the Y500?

That’s a great question with an even greater answer.

Family Tree DNA actually tests for a total of 550 markers. Values for markers between 112 and 550 are provided FOR FREE when you take a Big Y test.

Family Tree DNA guarantees that you will receive at least a total of 500 markers, or they will rerun your Big Y test at no cost to you to obtain enough additional markers to reach 500. (The 500 number assumes that you have all 111 STR markers. If you have not tested all of the STR panels, the number will be lower by the number of STR values you haven’t tested. This means that if you took the Y67, but not the Y111, your 500 guarantee number would be 500-44, where 44 is the number of markers in the Y111 panel that you have not yet ordered.)
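The guarantee arithmetic above can be sketched in a few lines. The constants come straight from this article (111 STR markers in panels 1-5, a 500 marker guarantee); the function name is my own invention.

```python
FULL_STR_PANEL = 111     # markers in panels 1-5
GUARANTEED_TOTAL = 500   # guarantee when all 111 STR markers have been tested


def guaranteed_markers(str_markers_tested):
    """Guaranteed marker total for a Big Y customer at a given STR test level."""
    untested = FULL_STR_PANEL - str_markers_tested
    return GUARANTEED_TOTAL - untested


for level in [37, 67, 111]:
    print(f"Y{level}: at least {guaranteed_markers(level)} markers")
```

For a Y67 tester this gives 500 - 44 = 456, and in every case the guaranteed portion that comes from the Big Y itself is 500 - 111 = 389 markers.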

The best part?

The markers above 111 are ENTIRELY FREE with a Big Y test – for both existing customers who have already taken that test, and all future customers too. Yes, you read that right. If you took the Big Y previously, you are receiving the markers in panel 6, 112-550 absolutely free.

How does it get better than free?

The Big Y Uses a Different Technology

There is a difference between the first 111 markers and the markers from 112-550, meaning that they are read using different technologies.

The results for the first 111 STR markers are produced using a technology that targets these specific areas and is very accurate.

The results for the 112-550 markers are produced using next generation sequencing (NGS) on a different testing platform than the Y-111 results. NGS, utilized for the Big Y, scans the Y chromosome rather than targeting specific locations. This scanning process is repeated several times, with values at specific locations recorded.

Scanning

Using NGS technology, your DNA is scanned multiple times, with the number of scans, such as 25 or 30, referred to as the coverage level. The goal is for multiple/most/all scans to find the same value at the same location consistently. Because of the nature of scanning technology, this sometimes doesn’t happen, for various reasons, including “no-calls” which is when for some reason, the scans simply can’t get a reliable read at that location in your DNA. No calls are typical and occur at low levels in everyone’s scan.

Here’s an example from a Big Y scan viewing the actual results using the Big Y chromosome browser.

The blue bars are forward reads and the green bars are reverse reads. Dark blue and dark green bars indicate high quality scans. Medium blue and green are medium quality scans and faintly colored bars indicate poor quality. If you take a look at where the little black arrow at the top is pointing, you can see that a T is the expected value at that location.

When the expected value as determined in the human reference genome is found at that location, nothing is recorded in that column. However, when a different result is discovered, like A in this case, it’s noted and highlighted with pink. We can see that there are 5 As on forward and reverse strands of high quality, then a low quality read, 6 more high quality reads, followed by two reads that show the expected value (nothing recorded) and then three more high quality A reads.

The goal is to determine what actual value resides at that location, and when that value is determined, it’s referred to as a “call.”

For a “call” to be made, meaning the determination of the actual value in that position, the person or software making the call must take several quality factors into consideration.

In this case, the number of high quality reads indicating the derived (mutation) value of “A” allows this location to be definitively called as “A.” Because several other men previously tested have A at this location, a SNP name has already been assigned to this mutation – in this case, A126 in haplogroup R.

However, if you look to the right and left of the arrow to the next two browser locations that contain mutations, you can see in both cases that fewer than half of the column locations are marked in pink with derived values (mutations), meaning values not expected when compared to the reference model.

Locations like these, which are neither clearly ancestral (reference model) nor clearly derived, are where value judgements come into play in deciding which value, ancestral or derived, is actually present in the DNA of the person being tested.

Some people will call a SNP with only one mutation reported out of 20 or 30 scans. Some people will call a SNP with 2 scans; some with 5, and so forth. Generally, Family Tree DNA uses a minimum threshold of 5 high quality scans to call a mutation value.
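To make the effect of the threshold concrete, here is a simplified sketch of a calling rule. It is not Family Tree DNA's actual software, which weighs several quality factors; it only counts high quality reads showing a derived value and compares that count to a minimum. The read list mirrors the browser example described above, assuming the single low quality read also showed an A, and the "high"/"low" labels are my own stand-ins for the shading in the browser.

```python
from collections import Counter


def call_position(reads, reference, min_high_quality=5):
    """Call the derived base only if enough high quality reads support it."""
    derived = Counter(
        base for base, quality in reads
        if quality == "high" and base != reference
    )
    if not derived:
        return reference
    base, count = derived.most_common(1)[0]
    return base if count >= min_high_quality else reference


# 5 high quality As, 1 low quality read, 6 more As, 2 reference reads, 3 more As
reads = (
    [("A", "high")] * 5 + [("A", "low")] + [("A", "high")] * 6
    + [("T", "high")] * 2 + [("A", "high")] * 3
)
print(call_position(reads, reference="T"))

# A sparse location: only 2 derived reads among 22 high quality reads
sparse = [("A", "high")] * 2 + [("T", "high")] * 20
print(call_position(sparse, reference="T"))                      # threshold of 5
print(call_position(sparse, reference="T", min_high_quality=1))  # a looser caller
```

The second location is the kind where two analysts can disagree: with a minimum of 5 the reference value stands, while a caller willing to accept a single read would report a mutation.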

Now, let’s talk about how STR values, meaning results displayed in those locations between 112-550, are found in your Big Y NGS data file. You can read about the difference between SNPs and STRs in the article, STRs vs SNPs, Multiple DNA Personalities.

STRs

Short tandem repeats, known as STR values, are the numbers reported in your STR panels. These are stutters of DNA, kind of like the copy machine got stuck in that one area for a few copies.

For example, in haplogroup R, for this person, the value of 13, meaning 13 repeats of a particular sequence, is found at marker DYS393.

Repeated sequences are in essence inserted in-between SNPs in some DNA regions, and the number of repeats reported in STR marker panels is the number of stutters, or repeats, of a particular repeated sequence.

That sounds simpler than it is, because how to count a sequence isn’t always the same. Let’s look at an example showing 20 consecutive DNA positions.

The actual values are shown in the value row. However, these values can be counted in a number of different ways. I’ve also added a “stray read” at location 13 which causes confusion.

At location 13, we show a value of G which does not fit into the repeat pattern. How do we interpret that, and what do we do with it?

The repeat pattern itself is a matter of where you start counting, and how you count.

I’ve color coded the repeats with blue and yellow. Incomplete repeats are red. The stray G in location 13 is green, because it breaks the repeat sequence.

In example 1, we start counting with T in position 1, and there are clearly 3 repeated groups of TACG before we hit our stray G in position 13, which stops the repeat pattern. However, after the stray G, there is one more full repeat sequence of TACG. Do we ignore the G and count the 4th TACG as part of the group, or do we count only the first 3 complete TACG sequences? The total number of repeats could be counted as either 3 or 4, depending on how we interpret the stray G in location 13.

In example 2, we start counting with the GTAC, because I was simulating a reverse read where we start at the end and work backwards. In this case, we clearly have 2 reads, then our stray G which occurs in the middle of a read. Do we ignore that stray G and call the rest of the blue GTAC surrounding the G as a repeat? That blue repeat group is followed by another yellow group. Do we count it at all, or do we simply stop with the marker count of 2 because the G is in the way and breaks the sequence? This repeat sequence could be counted as either 2, 3 or 4, depending on what you do with the G and the following sequence group, both.

Examples 3 and 4 follow the same concept and have the same questions.

All STR sequences face the issue of where to start reading. Where you begin reading can affect the number of repeat counts you wind up with, even without our stray G in position 13.
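Here is a small sketch of the counting problem. The 20 position sequence is my reconstruction from the description of example 1 (three TACG repeats, the stray G at location 13, one more TACG, then an incomplete repeat), so treat it as an assumption rather than the exact values in the graphic. The `count_repeats` function is hypothetical and far simpler than any real STR caller.

```python
def count_repeats(sequence, motif, start=0, skip_one_stray=False):
    """Count consecutive copies of motif beginning at index start.

    With skip_one_stray=True, a single base that interrupts the pattern is
    stepped over once, provided a full copy of the motif follows it.
    """
    count = 0
    i = start
    skipped = False
    size = len(motif)
    while i + size <= len(sequence):
        if sequence[i:i + size] == motif:
            count += 1
            i += size
        elif skip_one_stray and not skipped and sequence[i + 1:i + 1 + size] == motif:
            skipped = True
            i += 1
        else:
            break
    return count


# Assumed sequence: locations 1-12 TACG x3, location 13 stray G, 14-17 TACG, 18-20 TAC
seq = "TACGTACGTACGGTACGTAC"

print(count_repeats(seq, "TACG"))                       # stop at the stray G
print(count_repeats(seq, "TACG", skip_one_stray=True))  # step over the stray G
print(count_repeats(seq, "ACGT", start=1))              # start one position later
```

The same 20 positions yield 3, 4 or 2 repeats depending only on how the stray G is treated and where counting begins.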

STR markers obtained from NGS sequencing face this same challenge, but it’s complicated by the issue of no-reads and the call variance that we saw in the chromosome browser where the same location is sometimes called differently on different scans, meaning we really can’t tell which is the actual value. What do we do with those?

All of this is complicated by the fact that some regions of the Y chromosome simply do not produce valid or reliable information. Different (groups of) people define this unreliable region as starting and ending in different locations. Therefore different people analyzing the same information often arrive at different answers to the same question or use marker locations that others don’t.

I suspect all of this may fall into the category of trivia you never wanted to know, but now you’ll understand why you may find different (sometimes strongly held) opinions of what is “right” when two geeky types are arguing strongly about a particular STR value as your eyes glaze over…

Here’s the bottom line – if you’re using results called by the same vendor, you don’t have to worry about whether you and someone else are being accurately compared. You and everyone else at that vendor will have your results reported using the same technology and calling methodology.

Family Tree DNA has always taken a more conservative approach, because they only want to report to customers what they know to be accurate.

You will not see low confidence values on your reports, nor calls from an unreliable region. Genealogists cannot reach reliable genealogical conclusions using unreliable data.

The Big Y 500

Because of the nature of scanned STR results, Family Tree DNA can’t guarantee that you will have a reliable read at every location. In fact, few people will have values at every location. The technology for the Y-111 markers provides a very high level of accuracy and Family Tree DNA will provide results for every 1-111 location unless you actually have a deletion, meaning no DNA in that location. However, the values of markers 112-550 are taken from the Big Y NGS scan.

Therefore, some Big Y customers will have a few markers above 111 that show a “-” instead of results, such as FTY945 and FTY1025, shown below. A value of “0” found in markers 1-111 means that there is actually no DNA in that location, and it’s not a read error. No DNA at a specific location is heritable, meaning it can serve as a line-marker mutation, while a “no call” means that the scan couldn’t read that genetic address. No calls cannot be compared to others and should be ignored.

Before someone starts to complain about having markers with “no reads,” remember that Family Tree DNA is providing up to 439 additional markers FOR FREE to customers who have taken (or will take) the Big Y test.

That’s right, there is no charge for these new markers. You are guaranteed 389 additional markers, but you may actually receive as many as 439, depending on how well your DNA reads. The kits I’ve checked have only been missing a couple of marker values, so these kits received 437 additional markers, far above the guaranteed 389.

Right now, matching is not included for the 112-550 markers. Matching above 111 markers may be challenging because while Family Tree DNA does guarantee that you’ll have at least 389 new marker values, those won’t be the same markers above 111 for everyone. In a worst-case scenario, you could mismatch with someone on as many as 100 markers above the 111 panel, simply because you and the person you are matching against are each missing 50 different markers, for a total of 100 markers mismatching.

Additionally, not everyone has tested all 111 STR markers, and you will receive your 112-550 values if you have taken the Big Y test regardless of whether or not you’ve tested all 111 STR markers.

Matching

Matching on the first 111 markers is reliable because you will have an accurate value, even if the value is 0. Having no DNA at a specific location is a valid result and can be compared to other testers.

With different markers between 112 and 550 missing for different men, matching becomes very tricky. Specifically, how do we interpret mismatches? How many mismatches do we allow to still be considered a reasonable match?

Matching is an entirely different prospect when integrating the markers between 112 and 550 into the equation with a potential of up to 100 mismatching locations in that range simply from no-reads.
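One way to picture the problem is to compare two men marker by marker and set aside any location where either man has a no-call, as suggested earlier. This is purely a sketch of that idea, not how Family Tree DNA does or will do matching. FTY945 and FTY1025 are real marker names from the example above, but the other two marker names and all of the values are made up, and `None` stands in for the “-” shown on the results page.

```python
def compare_markers(kit_a, kit_b):
    """Tally matches, mismatches and locations that can't be compared."""
    tally = {"matches": 0, "mismatches": 0, "not_comparable": 0}
    for marker in set(kit_a) | set(kit_b):
        a, b = kit_a.get(marker), kit_b.get(marker)
        if a is None or b is None:
            # A no-call (or untested marker) on either side is ignored.
            tally["not_comparable"] += 1
        elif a == b:
            # A value of 0 is a real result, so it is compared like any other.
            tally["matches"] += 1
        else:
            tally["mismatches"] += 1
    return tally


kit_a = {"FTY945": None, "FTY1025": 11, "EXAMPLE_1": 9, "EXAMPLE_2": 14}
kit_b = {"FTY945": 10, "FTY1025": None, "EXAMPLE_1": 9, "EXAMPLE_2": 15}
print(compare_markers(kit_a, kit_b))
```

Here each man is missing one marker the other has, so two of the four locations drop out of the comparison entirely. Scale that up to 50 different missing markers each and you arrive at the 100 location worst case.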

I had presumed that Family Tree DNA would offer matching on these additional markers. Presume is a dangerous word, I know. Matching is not offered right now, and given the complexities, I don’t know if matching as we know it will be the future or not, how reliable it would be, or how Family Tree DNA would compensate for the missing STR information that differs with each person’s test.

Furthermore, I’m not quite sure what they would do with two men who haven’t both tested to the same STR level, meaning panels 1-5, but have taken the Big Y so have values for 112-550.

Big Y Purchases

Here’s the status of Big Y tests, today:

  • New Big Y purchase if you have done no Y DNA testing at all – you will now be able to purchase a Big Y without having to previously purchase any STR markers. The 111 STR markers are now bundled into the Big Y purchase, which makes the Big Y appear more expensive than before when the STR markers had to be purchased separately before you could order a Big Y test. The Big Y plus all 111 STR markers is now $649 during the DNA Day Sale, regularly $799.
  • Already tested through 111 STRs – the Big Y is only $349 on sale right now, and $449 regularly, both significantly discounted from just a few months ago.
  • Existing customers who have taken some level of Y STR test but not the Big Y – will have to upgrade their STR test to the 111 level when ordering the Big Y. Those tests are discounted appropriately, shown in the table below.
  • Existing customers who have not tested their STR markers to 111, but have already taken the Big Y – will receive marker values from 112-550. However, they will only receive the Y STR markers below 112 for panels they have paid for. This means that if you have only tested to 37 markers, you will have results for locations 1-37, not for 38-111, but will have results for locations that read from 112-550. This would be the perfect time to upgrade so that you have a complete marker set.

Right now, Family Tree DNA is having their DNA Day Sale and it’s a great time to purchase a Big Y or to upgrade your STR markers if you don’t have the full 111. The sale pricing shown is valid through April 28th. You can click here to order.


Glossary – Terminal SNP

What is a Terminal SNP?

It sounds fatal doesn’t it, but don’t worry, it’s not.

The phrase Terminal SNP is generally used in conjunction with discussing Y DNA testing and haplogroup identification.

SNPs Define Haplogroups

In a nutshell, SNPs, single nucleotide polymorphisms, are the mutations that define different haplogroups. Haplogroups reach far back in time on the direct paternal, generally the surname, line.

SNPs, the mutations that define haplogroups, are considered to be “once in the lifetime of mankind” events that divide one haplogroup into two subgroups, or branches.

A haplogroup can be thought of as the ancient genetic clan of males – specifically their Y DNA. You might want to read the article, What is a Haplogroup?

If you test your Y DNA with Family Tree DNA, you’ll notice that you receive an estimated haplogroup with the regular Y DNA tests which test STR, or short tandem repeat, markers. STRs are the markers tested in the 37, 67 or 111 marker tests. You can read about the difference between STRs and SNPs in the article, STRs vs SNPs, Multiple DNA Personalities.

STR markers are used for more recent genealogical testing and comparison, while haplogroups reach further back in time.

An estimated haplogroup as provided by Family Tree DNA is based on STR matches to people who have done SNP testing. Estimated haplogroups are quite accurate, as far as they go. However, by necessity, they aren’t deep haplogroups, meaning they aren’t the leaves on the end of the twigs of the branch of your haplotree. Estimated haplogroups are the big branches.

In essence, what a haplogroup provided with STR testing tells you is the name of the town and the main street through town. To get to your house, you may need to turn on a few side streets.

Haplotree

The haplotree, back in the ancient days of 2002, used to hold fewer than 100 haplogroups, each main branch called by a different letter of the alphabet. The main branches, or what is referred to as the core backbone, are shown in this graphic from Wikipedia.

Today, the haplotree shown for each Y DNA tester on their personal page at Family Tree DNA, has tens of thousands of branches. No, that’s not a misprint.

The haplotree is the phylogenetic tree that defines all of the branches of mankind and groups them into increasingly refined “clans” or groups, the further down the tree you go.

In other words, Y Adam is at the root, then his “sons” who, due to specific mutations, formed different base haplogroups. As more mutations occurred in the son’s descendants’ lines, more haplogroups were born. Multiply that over tens of thousands of years, and you have lots of branches and twigs and even leaves on the branches of this tree of humanity.

Let’s look at the terminal SNP of my cousin, John, on his Haplotree and SNP page at Family Tree DNA.

John’s terminal SNP is R-BY490. R indicates the main branch and BY490 is the name of the SNP that is the furthest down the tree – his leaf, for lack of a better definition.

In John’s case, we know this is the smallest leaf on his branch, because he took the Big Y test which reads all of his SNPs on the Y chromosome.

Haplogroup R is quite large with thousands of branches and leaves – each one with its own distinct history that is an important part of your genealogy. Tracking where and when these mutations happened tells you the migration history of your paternal ancestor.

How else would you ever know?

How Do I Discover My Terminal SNP?

Sometimes “terminal SNP” is used to mean the SNP for which a man has most recently tested. It may NOT mean that he has tested for all of the available SNPs. What this really means is that when someone gives you a terminal SNP name, or you see one listed someplace, you’ll need to ask about the depth of the testing undergone by the man in question.

Let’s look at an example.

I’ve condensed John’s tree into only the SNPs for which he tested positive. The entire tree includes SNPs that John tested negative for, and their branches which are not relevant to John – although we certainly didn’t know that they weren’t relevant before he tested. However, he may want to reference the large and accurate scientific tree, so all information is provided to John. It’s like seeing a map that includes all roads, not just the one you’re traveling.

I’ve created a descendant chart style tree below. Y line Adam is the first male. Thousands of years later, his descendant had a mutation that created haplogroup R defined by the SNP M207, in yellow.

John, based on his STR matches, was predicted to be R-M269. On his results page, that’s the estimated haplogroup that was showing when his results were first returned.

If you had asked John about his terminal SNP, he would have probably told you R-M269. At that time, to the best of his knowledge, that WAS his terminal SNP – but it wasn’t really.

John could choose three ways to test for additional SNPs to discover his actual terminal SNP.

  • One by One

John could selectively test one SNP at a time to see if he was positive, meaning that he has that mutation. SNPs cost $39 each to test, as of the time this article was written. Of course, John could also be negative for that SNP, meaning he doesn’t have the SNP, and therefore does not descend from that line. That’s good information too, but then John would have to select another branch to test by purchasing the SNP associated with that new branch.

If John had selected any of the SNPs on the list above to test, he would have tested positive. So, let’s say John decided to test L21, a major branch. If he tested positive, that means that all of the branches directly above L21, between L21 and M207, are also positive, by inference.

At that point, John would tell you that his terminal SNP is L21, but it isn’t actually.

  • SNP Packs

Now, John wants to purchase a more cost-effective SNP pack, because he can test 100 or more SNP locations by purchasing one SNP pack for $99. That’s a great value, so John purchases the SNP pack offered on his personal page. A SNP pack tests selective SNPs all over the relevant portion of the tree in an attempt to place a man on a relatively low branch. These SNPs are selected to find an appropriate branch, not the appropriate leaf. They confirm (or disprove) SNPs that have already been discovered.

Let’s say, in John’s case, the SNP pack moves him down to R-ZP21. If you asked him now about his terminal SNP, he would probably tell you R-ZP21, but it still isn’t actually.

SNP packs are great and do move people down the tree, but the only way to move to the end of the twigs is the Big Y test.

  • The Big Y Test

The Big Y test tests for all known SNPs as well as Unnamed Variants (formerly called Novel Variants), which are newly discovered SNPs that have not yet been named. You may have a new SNP in your line waiting to be discovered. The Estes family has one dating from sometime before 1495 that, to date, has only been found in Estes descendant males from that common ancestor who was born in 1495.

The Big Y test scans virtually the entire Y chromosome in order to place testers on the lowest leaf of the tree. You can’t get there any other way with certainty and you’ll never know if you have any as yet undiscovered SNPs or leaves unless you take the Big Y.

In John’s case, that leaf was 4 more branches below R-ZP21, at R-BY490.
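John's journey through the three testing options can be sketched as a walk down a list. The path below uses only SNP names mentioned in this article and leaves out the intermediate branches, so it's a simplified stand-in for his real haplotree, and both function names are hypothetical. The point is that a positive result implies every SNP above it, and that the “terminal SNP” someone reports is only as deep as their testing.

```python
# Simplified path from the top of haplogroup R down to John's Big Y result.
# Intermediate branches are omitted, so this is not his complete tree.
JOHN_PATH = ["M207", "M269", "L21", "ZP21", "BY482", "BY490"]


def inferred_positives(path, tested_positive):
    """A positive result implies every SNP above it on the path is positive too."""
    return path[: path.index(tested_positive) + 1]


def apparent_terminal_snp(path, positives):
    """Deepest SNP on the path among those known to be positive."""
    known = [snp for snp in path if snp in positives]
    return known[-1] if known else None


stages = [
    ("STR prediction", "M269"),
    ("single SNP test", "L21"),
    ("SNP pack", "ZP21"),
    ("Big Y", "BY490"),
]
for test, result in stages:
    positives = inferred_positives(JOHN_PATH, result)
    print(f"{test}: R-{apparent_terminal_snp(JOHN_PATH, positives)}")
```

Each stage reports a different “terminal SNP” for the same man, which is why it pays to ask how deeply someone has tested.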

Why Does a Terminal SNP Matter?

Haplogroup R-M269 is the most common haplogroup of European men.

Looking at the SNP map, you can see that there are so many map locations as to color the map of the UK entirely red.

Genealogically, this isn’t helpful at all.

However, looking now at DF49, below, we see many fewer locations, suggesting perhaps that men with this terminal SNP are clustered in particular areas.

SNPs further down John’s personal haplotree tell an increasingly focused and granular story, each step moving closer in time.

Summary

Men generally want to discover their terminal SNP with the hope that they can learn something interesting about the migration of their ancestors before the genesis of surnames.

Perhaps they will discover that they match all men with McSurnames, suggesting perhaps a Scottish origin. Or maybe their terminal SNP is only found in a mountainous region of Germany, or perhaps their Big Y matches all have patronymic surnames from Scandinavia.

Big Y testing is also a community sourced citizen science effort to expand the Y haplotree – and quite successfully. The vast majority of SNPs on the publicly available ISOGG Y tree today are from individual testers, not from academic studies.

Haplogroups, and therefore terminal SNPs, are the only way we have to peek back behind the veil of time.

If you’re interested in discovering your terminal SNP, you’ll be money ahead to simply purchase the Big Y up front and skip individual SNP testing along with SNP packs. In addition to discovering your terminal SNP, you are also matched to other men who have taken the Big Y test.

You can order the Big Y, individual SNPs or SNP packs by clicking on this link, signing on to your account, and then clicking on the blue “Upgrade” button, either in the Y DNA section, shown below, or in the upper right hand corner of your personal page.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Working with Y DNA – Your Dad’s Story

Have you ever wondered why you would want to test your Y DNA? What would a Y DNA test tell you about which ancestors? What would it mean to you and how would it help your genealogy?

If you’re like most genealogists, you want to know every single tidbit you can discover about your ancestors – and Y DNA not only tells males about people they match that are currently living and share ancestors with them at some point in time, but it also reaches back beyond the range of what genealogy in the traditional sense can tell us – past the time when surnames were adopted, peering into the misty veil of the past!

If you aren’t a male, you can’t directly test your Y DNA, because you don’t have a Y chromosome, but that’s OK, because your father or brother or another family member who does carry the same Y chromosome (and surname) as your father may well be willing to test.

What Is Y DNA?

Y DNA is a special type of DNA that tells the direct story of your father’s surname line heritage – all the way back as far as we can go – beyond genealogy – to the man from whom we are all descended that we call “Y line Adam.” In the pedigree chart below, Y DNA is represented by the people with blue squares – generally the surname line.

Y DNA is never mixed with the mother’s DNA, so the Y DNA of the blue line of ancestors above remains unbroken and intact, and the Y DNA is passed from a father only to his male children. The Y chromosome is what makes males male, so females never inherit a Y chromosome. Of course, that means females can’t take Y DNA tests, so they have to ask a family member to test who carries the Y chromosome of the line they are interested in.

Because the surname doesn’t typically change for males between generations, this test is particularly powerful in identifying specific lineages of the male’s surname.  For men looking to identify their paternal line, Y DNA testing is extremely powerful!

Y DNA testing is a great way to determine which ancestral line of a given surname a male descends from.

Want to see how this works?  Family Tree DNA provides 13 great tools for every Y DNA customer. Let’s take a look!

Haplogroup

Everyone who tests their Y DNA at Family Tree DNA receives a haplogroup assignment. Think of a haplogroup as your genetic clan. Haplogroups have a history and a pedigree chart, just like people do. Haplogroups and their branches can identify certain groups of people, such as people of African descent, European, Asian, Jewish and Native American.

While the Y DNA is passed intact with no admixture from the mother, occasionally mutations do happen, and it’s those historical mutations that form clans and branches of clans as generation after generation is born and continues to migrate to new areas.

If you take any Y DNA test at Family Tree DNA, you will receive a haplogroup prediction. In the following example, the gentleman received haplogroup C-P39 as his haplogroup prediction.

Haplogroup predictions from Family Tree DNA are very accurate. They are basic in nature, but detailed enough to identify the continent where your ancestors are found as well as sometimes identifying groups like Jewish or Native American. To receive a more refined haplogroup, additional tests are available (individual SNPs, SNP panels and the Big Y), which confirm the original haplogroup assignment and give you the opportunity to find the smallest branch of the haplotree upon which you reside as a leaf.

Let’s look at an example.

Y haplogroup C arose in Asia and subgroups are found today in parts of Asia, Europe and among Native American men.

Recently, by utilizing the Big Y test, an advanced specialized test that scans the majority of the Y chromosome for mutations, the haplogroup C tree was extended by several branches at Family Tree DNA.

With regular STR marker testing, which is the Y DNA test you purchase from Family Tree DNA,  this particular haplogroup C male had his base haplogroup of C identified along with the additional branch of C-P39. With additional advanced testing of some type, such as individual SNP testing, panels of SNPs available for some haplogroups, or the Big Y test – testers can learn more about their haplogroups – and with the Big Y, virtually everything there is to know about their Y chromosome.

However, until testers receive their regular STR results for their markers, advanced tests aren’t available to order, because testers don’t yet know into which haplogroup, or clan, they will be placed.

The haplogroup C Y-DNA project at Family Tree DNA provides a map of the most distant known ancestors of Haplogroup C members, including all branches, shown below.

Haplogroup C-P39, a Native American subgroup, is found in a much more restricted geography in the Haplogroup C-P39 project, below.

Tools at Family Tree DNA

At Family Tree DNA, your Y haplogroup is shown in the upper right hand corner on your personal page dashboard.

In the Y DNA section, additional tools are shown. Let’s look at each tool and what it can tell you about your direct paternal line.

You can always navigate to the Dashboard or any other option by clicking on the myFTDNA button on the upper left hand corner and then the Y DNA dropdown.

Matches

The first place most people look is at their Matches page. In the case of our example, he has twenty-three 111 marker matches ranging from one person with a genetic distance of 1, meaning one mutation difference, to several with 6 mutations difference. In general, the fewer the mutations, the more likely it is that your most recent common ancestor with your match lived closer to the present.
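To make "genetic distance" concrete, here is a minimal sketch that counts the repeat differences between two men's STR values. It assumes a simple step-by-step count on every marker; the lab's actual calculation may treat some markers differently. The marker names are real STR markers, but the values and the function name are made up for illustration.

```python
def genetic_distance(tester, match):
    """Sum the repeat-count differences over the markers both men have tested.

    Assumption: every marker is counted step by step, so a difference of
    two repeats on one marker adds 2 to the distance.
    """
    shared_markers = tester.keys() & match.keys()
    return sum(abs(tester[m] - match[m]) for m in shared_markers)


# Hypothetical values for four markers.
tester = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}
match = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

# The two men differ by one repeat on DYS390 only.
print(genetic_distance(tester, match))
```

Under this simple model, an exact match has a distance of 0, and each additional mutation pushes the match one step further away.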

You can see by just looking at the matches below why entering the name of your earliest known ancestor (under Manage Personal Information, Account Settings, Genealogy) is so important!!! That’s the first thing people see and the best indication of a common ancestor. I always include a name, birth/death date and location.

In this case, it’s very clear the common ancestor of most, if not all, of these men is Germain Doucet born in 1641 in Port Royal, Nova Scotia. And before you ask, yes, it’s rather unusual to have an entire list of men descended from one man, but it’s clearly not unheard of.

As you can see, many of these matches (names obscured for privacy) have trees attached to their results and several have also taken the autosomal Family Finder test.

The different Y-DNA haplogroups listed to the right are a function of the “Terminal SNP,” meaning the SNP that tested positive furthest out towards the tip of the branch of the tree. Four matches have had additional SNP testing which shows their terminal SNP to be either Z30754 or M217.

This gentleman can then view his 67, 37, 25 and 12 marker matches by clicking on that dropdown.

He can also e-mail any of his matches by clicking on the envelope icon or view their trees by clicking on the pedigree icon.

Results

Next, let’s look at the Y-STR results for 67 markers. This page should really probably say “raw results,” because as many people say, “it’s just a page of numbers.”

This page shows your values and mutations at specific markers – in other words, what makes you both different from other people and the same as people you match, which means you share a common ancestor at some point in time in the not too distant past.

The beauty of these numbers, is, of course, in what they tell us in context of matching other people. You can’t have matches without these numbers. You also can’t have maps or anything else without the raw mutation information.

HaploTree and SNP Page

STR markers show mutations in recent timeframes, generally within the past 500-800 years, but SNPs take you back into antiquity – just like your family pedigree chart – working from the most recent generations to those further back in time.

Your Haplotree and SNP page shows you the tree for your haplogroup – in this case C – designated by SNP M216, shown at the very top, along with all branches of the tree. The branches and leaves are color coded based on whether you have tested for that particular SNP, and if so, whether you were positive, meaning you carry the mutation, or negative, meaning you don’t.

SNP Map

The SNP map shows you cluster locations worldwide where any selected SNP is found.

Matches Maps

One of my favorite tools is the Matches Map because it shows the most distant ancestor for all of your matches that have provided that information.

Hint: you MUST enter the geographic information through the link at the bottom of this map (below) for YOUR ancestor to be displayed on THIS map and also on the maps of your matches.

You can also display your match list by clicking on the link beneath the map. You can click on the pins on the map to display the accompanying information.

Note the legend, as your exact matches are shown in red, 1 step mutations in orange, 2 steps in yellow, and so forth. Be sure to look for clusters, and note that if there are multiple people listed in the same location, their pins will stack on top of each other.

For example, in this case, the orange pin shown has two people’s ancestors in that location, including this tester, and a relevant cluster is clearly shown in Nova Scotia.

Migration and Frequency Maps

Are you wondering how your ancestor and his ancestors arrived where you first find them?

The haplogroup Migration Map shows you the path from Africa to wherever they are found – in this case, the Americas.

The Frequency Map then shows you how much of the New World population belongs to branches of haplogroup C.

Haplogroup Origins

The Haplogroup Origins tool shows the distribution of the haplogroup, by region, by match type and count.  Please note that you can click on any graphic to enlarge.

For example, this person has one 111 marker C-Z30765 match in Canada.

Ancestral Origins

The Ancestral Origins page shows matches by country along with any comments. These matches don’t have any comments, but comments might be Ashkenazi or MDKO (most distant known origin) when US is given.

Advanced Matching Combines Tools

Another of my favorite tools is the Advanced Matching tool, available under the Tools and Apps tab.

Advanced Matches is a wonderful tool that allows you to combine test types. For example, let’s say that you want to know if any of the people you match on the Y DNA test are also showing up as a match on the Family Finder test. You could further limit match results by project as well.

Be sure to click on “show only people I match in all selected tests” or you’ll receive the combined list of all matches, not just the people who match on BOTH tests, which is what you want.

In this example, I’ve selected 12 markers and Family Finder, because I know I’m going to find a few matches for illustration.

Of course, for adoptees, finding someone with whom you match closely on the Family Finder test AND match exactly (or nearly) on the Y DNA test would be very suggestive of a patrilineal common ancestor in a recent timeframe.
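The difference that checkbox makes is the difference between an intersection and a union of your match lists. A minimal sketch, with hypothetical kit labels and a hypothetical function name (this is not Family Tree DNA's code, just the idea):

```python
def advanced_matches(match_lists, only_in_all=True):
    """Combine match lists from several tests.

    only_in_all=True  -> people who appear on every selected test
    only_in_all=False -> the combined list of everyone on any selected test
    """
    sets = [set(names) for names in match_lists.values()]
    if not sets:
        return set()
    return set.intersection(*sets) if only_in_all else set.union(*sets)


# Hypothetical match lists for the two selected tests.
selected = {
    "Y-DNA 12": ["Kit A", "Kit B", "Kit C"],
    "Family Finder": ["Kit B", "Kit D"],
}

print(sorted(advanced_matches(selected)))                      # matches on BOTH tests
print(sorted(advanced_matches(selected, only_in_all=False)))   # combined list
```

With the box checked, only Kit B survives, which is the short list you actually want to investigate.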

Projects

We started our discussion about Y DNA haplogroups by referencing two different haplogroup C projects. Family Tree DNA has over 9000 projects for you to select from.  The good news is that you really don’t have to limit your selections, because you can join an unlimited number of projects.

Thankfully, you don’t have to browse through all the available projects.

  • Haplogroup projects are categorized by Y or mtDNA and then by subgroup where appropriate.
  • Surname projects exist as well and are searchable for your genealogy lines.
  • Geographical projects cover everything else, from geographies such as the Denmark project to the American Indian project.

Some projects focus on Y DNA, some on mtDNA and some include both.  Additionally, some projects welcome people with autosomal results that pertain to that family surname or region.  Every project is run by one or more volunteer administrators that define the focus of the project.

To help people select relevant projects, project administrators can enter surnames that pertain to their project so that Family Tree DNA can match your surname to the project list to provide you with a menu of candidate projects to join.

Of course, you’ll need to read the project description for each project to see if the project actually pertains to you. You can see what is available for other surnames by utilizing the “Search by Surname” function, at the bottom of the menu.

You can also scroll down and browse in a number of ways in addition to surname.

All testers should join their haplogroup project so that everyone can benefit from collaboration.

You can join and manage your projects from your home page by clicking on the Projects tab on the upper left, shown below.

Y DNA Summary

I hope this overview has provided you with some good reasons to test your Y DNA or to better understand your results if you’ve already tested.

If you are a male and are interested in testing a line that is not your surname line, or if you are a female and you can’t test, you can find a male who descends from the ancestral line in question through all males and recruit that gentleman to test.  You can also check existing surname projects to see if someone from your line has already tested.

Y DNA holds the secrets of your patrilineal line. You never know what you don’t know unless you test. You don’t know what kind of surprises are waiting for you – and let’s face it, our ancestors are always full of surprises!

Y DNA Order Options

Family Tree DNA is the only company that offers this type of testing.  Ordering options include 37, 67 and 111 marker tests. You can also order 12 and 25 marker tests within projects. I suggest testing at the highest level the budget will allow, but no less than 37 markers. Most people have matches. Some people have a lot of matches and need the 111 marker test to more fully refine their matches to just the ones that may be genealogically relevant.

You can always upgrade to a higher marker level later, but the original test plus the upgrade costs more in total than simply purchasing the larger test at the outset. It’s really a personal decision based on your goals and your budget.

Discounts

If you have never tested at Family Tree DNA, you can obtain a discount any day of the week by joining through your surname project. Just click here and then enter your surname into the Project Search box, shown upper right below.  I’ve typed Estes for purposes of illustration.

You will be shown a list of projects (at left above) where the various project administrators have indicated that someone with your surname might be interested in their project. Read the project descriptions, then click on the resulting project that best suits your situation – generally your surname – Estes above for example. You will automatically be joined to the project you select when you order a product, shown below. After you order, you can join multiple projects.

Next, click on the test level you wish to order.

By way of comparison, the project pricing for 37, 67 and 111 markers, above, saves you $20 off the regular price you would pay if you didn’t order through a project.

If you already have a kit number at Family Tree DNA and have ordered other products, you can sign in, upgrade and order your Y DNA test by clicking here.

Happy ancestor hunting!

______________________________________________________________

Native American Y Haplogroup C-P39 Sprouts Branches!

I am extremely pleased to provide an update on the Haplogroup C-P39 Native American Y DNA project. Marie Rundquist and I as co-administrators have exciting discoveries to share.

As it so happens, this announcement comes almost exactly on the 4th anniversary of the founding of this project at Family Tree DNA. We couldn’t celebrate in a better way!

Native American Y DNA Haplogroups

Haplogroup C is one of two core Native American male haplogroups. Of the two, haplogroup Q is much more prevalent, while haplogroup C is rare. Only some branches of both haplogroup Q and haplogroup C are Native American, with other branches of both haplogroups being Asian and European.

C-P39 is the Native American branch of haplogroup C, and because of its rarity, until now, very little was known. There were no known branches.

In February 2016, Marie Rundquist created a focused project testing plan to upgrade at least one man from each family line to the full 111 markers along with a Big Y test in order to determine if further differentiation could be achieved in the C-P39 haplogroup lineage.

Haplogroup C-P39 Sprouts Branches

In November 2016, Marie presented preliminary research findings at the International Genetic Genealogy Conference in Houston, Texas, with a final evaluation being completed and submitted to Family Tree DNA for review in March 2017. As a result, Marie provides the following press release:

April 29, 2017: Based on a recent “Big Y” DNA novel variant submission from the C-P39 Y DNA project, the Y Tree has been updated by Family Tree DNA scientists. With this latest update, in addition to the C-P39 SNP that distinguishes this haplogroup, there are now new, long-awaited, downstream SNPs and subclades, as reflected in the Y Tree that offer new avenues for research by members of this rare, Native American haplogroup. A summary of new C-P39 Y DNA project subclades follows:

  • North American Appalachian Region: C-P39+ C-BY1360+
  • North American Canada – Multiple Surnames: C-P39+ C-Z30765+
  • North American Canada – Multiple Surnames: C-P39+ C-Z30750+
  • North American Canada: Acadia (Nova Scotia): C-P39+ C-Z30750+
  • North American Canada: Acadia (Nova Scotia): C-P39+ C-Z30754+
  • North American Southwest Region: C-P39+ C-Z30747+

The following SNP (BY18405+) was found to have been shared only by two C-P39 project members in the entire Big Y system, as reported here:

  • North American Canada Newfoundland: C-P39+ C-BY18405+
  • North American Canada: Gaspe, QC: C-P39+ C-BY18405+

The ancestors of two families represented in the study, one in the Pacific Northwest and another in the North American Southwest did not experience any mutations in the New World and Big Y results are within the current genetic boundaries of the C-P39 SNP haplogroup as noted.

The Family Tree DNA C-P39 Y DNA Project is managed by Roberta Estes, Administrator, Marie Rundquist, Co-Administrator, and Dr. David Pike, Project Advisor. The “Big Y” DNA test is a product of Family Tree DNA.

Reference: https://www.familytreedna.com/public/ydna_C-P39

The New Tree

The new C-P39 tree at Family Tree DNA is shown, below, including all the new SNPs below P39, a grand total of eight new branches on the C-P39 tree.

It’s just so beautiful to see this in black and white – well, green, black and white. It’s really an amazing accomplishment for citizen scientists to be contributing at this level to the field of genetics.

Beneath C-P39, several sub-branches develop.

  • BY1360 which is represented by a gentleman from Appalachia.
  • BY736 which is represented by two downstream SNPs that include the surnames of both King and Brooms from Canada.
  • Z30747 which is represented by a Garcia from the southwest US, followed by downstream subgroup Z30750 represented by a Canadian gentleman, and SNP Z30754 represented by the Acadian Doucette family from Nova Scotia.

This haplotree suggests that the SNP carried by the gentleman from Appalachia is the oldest, with the other sub-branches descending from their common ancient lineage. As you might guess, this isn’t exactly what we had anticipated, but therein lies the thrill of discovery and the promise of science.

The Next Step

Just like with traditional genealogy, this discovery begets more questions. Now, testing needs to be done on additional individuals to see if we can further tease apart relationships and perhaps identify patterns to suggest a migration path. This testing will come, in part, from STR marker testing along with Big Y testing for some lines not yet tested at that level.

We’re also hopeful, of course, that anyone who carries haplogroup C-P39 or any downstream branch will join the C-P39 project. Collaboration is key to discovery.

Contributing

If you would like to donate to the C-P39 project general fund to play a critical role in the next steps of discovery, we would be eternally grateful. At this point, we need to fund at least 4 additional Big Y tests, plus several 111 marker upgrades, totaling about $3000. You can contribute to the project general fund at this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Thank you in advance – every little bit helps!

Kudos

I want to personally congratulate Marie for her hard work and dedication over the past year to bring this monumental discovery and tree update to fruition. It’s truly an incredible accomplishment representing countless hours of behind the scenes work.

Marie and I would both like to thank all of our participants, individuals who contributed funds to the testing, Dr. David Pike as a project advisor and, of course, Family Tree DNA, without whom none of this would be possible.

DNA Testing for Native Heritage

If you are male and have not yet Y DNA tested, but believe that you have a Native ancestor on your direct paternal (surname) line, please order at least the 37 marker test at Family Tree DNA. Your results and who you match will tell that story!

People with Native heritage on any ancestral line are encouraged to join the American Indian Project at Family Tree DNA. If you have tested elsewhere, you can download your results to Family Tree DNA for free.

For additional information about DNA testing for Native American heritage, please read Proving Native American Ancestry Using DNA.
