Y DNA: Part 2 – The Dictionary of DNA

After my introductory article, Y DNA: Part 1 – Overview, I received several questions about terminology, so this second article will be a dictionary or maybe more like a wiki. Many terms about Y DNA apply to mitochondrial and autosomal as well.

Haplogroup – think of your Y or mitochondrial DNA haplogroup as your genetic clan. Haplogroups are assigned based on SNPs, specific nucleotide mutations that change very occasionally. We don’t know exactly how often, but the general schools of thought are that a new SNP mutation on the Y chromosome occurs somewhere between once every 80 and once every 145 years. Of course, those are only averages. I’ve seen as many as two mutations in a father-son pair, and no mutations for many generations.

Dictionary haplogroup.png

Y DNA haplogroups are quite reliably predicted from STR results at Family Tree DNA, meaning the results of the 12, 25, 37, 67 or 111 marker tests. Haplogroups can only be confirmed, or expanded beyond the estimate, by SNP testing of the Y chromosome. Predictions are almost always accurate, but only apply to the upper level base haplogroups. I wrote about that in the article, Haplogroups and the Three Brothers.

Haplogroups are also estimated by some companies, specifically 23andMe and LivingDNA who provide autosomal testing. These companies estimate Y and mitochondrial haplogroups by targeting certain haplogroup defining locations in your DNA, both Y and mitochondrial. That doesn’t mean they are actually obtaining Y and mtDNA information from autosomal DNA, just that the chip they are using for DNA processing targets a few Y and mitochondrial locations to be read.

Again, the only way to confirm or expand that haplogroup is to test either your Y or mitochondrial DNA directly. I wrote about that in two articles, Haplogroup Comparisons Between Family Tree DNA and 23andMe, and Why Different Haplogroup Results?

Nucleotide – DNA is composed of 4 base nucleotides, abbreviated as T (Thymine), A (Adenine), C (Cytosine) and G (Guanine). Every DNA address holds one nucleotide.

In the DNA double helix, generally, A pairs with T and C pairs with G.

Dictionary helix structure.png

Looking at this double helix twist, the colored “ladder rungs” represent the 4 nucleotides. Purple and green have been assigned to one bonding pair, either A/T or C/G, and red and blue have been assigned to the other pair.

When mutations occur, most often A or T are replaced with their paired nucleotide, as are C and G. In this example, A would be replaced with T and vice versa. C with G and vice versa.

Sometimes that’s not the case and a mutation occurs that pairs A with C or G, for example.

For Y DNA SNPs, we care THAT the mutation occurred, and the identity of the replacing nucleotide so we know if two men match on that SNP. These mutations are what make DNA in general, and Y DNA in particular useful for genealogy.

The rest of this nucleotide information is not something you really need to know, unless of course you’re playing in the Jeopardy championship. (Yes, seriously.) The testing lab worries about these things, as well as matching/not matching, so you don’t need to.

SNP – Single nucleotide polymorphism, pronounced “snip.” A mutation that occurs when the nucleotide typically found at a particular location (the ancestral value) is replaced with one of the other three nucleotides (the derived value). SNPs that mutate are called variants.
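To make the ancestral/derived idea concrete, here is a minimal Python sketch. The SNP name `Example1`, the nucleotide values and the names `SnpResult` and `same_mutation` are all invented for illustration. A testing lab's actual calling and matching logic is far more involved.

```python
from dataclasses import dataclass

NUCLEOTIDES = {"A", "C", "G", "T"}


@dataclass(frozen=True)
class SnpResult:
    """One man's result at one named SNP location (hypothetical structure)."""
    name: str       # SNP name, made up for this sketch
    ancestral: str  # nucleotide typically found at this location
    observed: str   # nucleotide actually read for this tester

    def __post_init__(self):
        if self.ancestral not in NUCLEOTIDES or self.observed not in NUCLEOTIDES:
            raise ValueError("nucleotides must be one of A, C, G, T")

    @property
    def is_derived(self) -> bool:
        # Derived means the ancestral value has been replaced.
        return self.observed != self.ancestral


def same_mutation(a: SnpResult, b: SnpResult) -> bool:
    """Two men match on a SNP when both carry the same replacement nucleotide."""
    return (a.name == b.name and a.is_derived and b.is_derived
            and a.observed == b.observed)


# Invented values: C is treated as ancestral and T as the derived value.
tester_1 = SnpResult("Example1", ancestral="C", observed="T")
tester_2 = SnpResult("Example1", ancestral="C", observed="T")
tester_3 = SnpResult("Example1", ancestral="C", observed="C")  # still ancestral

print(same_mutation(tester_1, tester_2))  # True
print(same_mutation(tester_1, tester_3))  # False
```

The check looks at both facts that matter for matching: that a mutation occurred, and which nucleotide replaced the ancestral one.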

In Y DNA, after discovery and confirmation that the SNP mutation is valid and carried by more than one man, the mutation is given a name something like R-M269 where R is the base haplogroup and M269 reflects the lab that discovered and named the SNP (M = Peter Underhill at Stanford) and an additional number, generally the next incremental number named by that lab (269).

Some SNPs were discovered simultaneously by different labs. When that happens, the same mutation in the identical location is given different names by different organizations, resulting in multiple names for the same mutation in the same DNA location. These are considered equivalent SNPs because they are identical.

In some cases, SNPs in different locations seem to define the same tree branching structure. These are functionally equivalent until enough tests are taken to determine a new branching structure, but they are not equivalent in the sense that the exact same DNA location was named by two different labs.

Some confusion exists about Y DNA SNP equivalence.

Equivalence Confusion | How This Happens | Are They the Same?
Same exact DNA location named by two labs | Different SNP names for the same DNA location, named by two different labs at about the same time | Exactly equivalent because the SNPs are named for the exact same DNA location and define only one tree branch, ever
Different DNA locations and SNP names, one current tree branch | Different SNPs temporarily located on the same branch of the tree because branches or branching structure have not yet been defined | When enough men test, different branches will likely be sorted out for the non-equivalent SNPs, pointing to newly defined branch locations that divide the tree or branch

Let’s look at an example where 4 example SNPs have been named: two at the same location, and two more at two additional locations. Initially, however, we don’t know how this tree actually looks, meaning what is the base/trunk and what are branches, so we need more tests to identify the actual structure.

Dictionary SNPs before branching.png

The example structure of a haplogroup R branch, above, shows that there are three actual SNP locations that have been named. Location 1 has been given two different SNP names, but they are the same exact location. Duplicate names are not intentionally given, but result from multiple labs making simultaneous discoveries.

However, because we don’t have enough information yet, meaning not enough men have tested who carry at least some of the mutations (variants), we can’t yet define trunks and branches. Until we do, all 4 SNPs will be grouped together. Examples 1 and 2 will always be equivalent because they are simply different names for the exact same DNA location. Eventually, a branching structure will emerge for Examples 1/2, Example 3 and Example 4.

Dictionary SNP branches.png

Eventually, the downstream branches will be defined and split off. It’s also possible that Example 4 would be the trunk with Examples 1 and 2 forming a branch and Example 3 forming a branch. Branching tree structure can’t be built without sufficient testers who take the NGS tests, specifically the Big Y-700 which doesn’t just confirm a subset of existing named SNPs, but confirms all named SNPs, unnamed variants and discovers new previously-undiscovered variants which define the branching tree structure.
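One way to picture why more testers are needed is to group SNPs by exactly which men carry them. The Python sketch below does that with invented testers and one hypothetical arrangement (Example 4 as the trunk, then Examples 1/2, then Example 3). The function name and data are assumptions for illustration, not how any lab actually builds its tree.

```python
from collections import defaultdict


def group_snps_by_carriers(results):
    """Group SNPs that are carried by exactly the same set of testers.

    results maps a tester label to the set of SNPs he tested positive for.
    SNPs sharing an identical carrier set cannot be separated yet, so they
    sit together in one block of the tree.
    """
    carriers = defaultdict(set)
    for tester, snps in results.items():
        for snp in snps:
            carriers[snp].add(tester)

    blocks = defaultdict(list)
    for snp, who in carriers.items():
        blocks[frozenset(who)].append(snp)
    return sorted(sorted(block) for block in blocks.values())


all_four = {"Example1", "Example2", "Example3", "Example4"}

# Early on, every tester carries all four SNPs, so nothing can be split.
early = {"Tester A": all_four, "Tester B": all_four}
print(group_snps_by_carriers(early))
# [['Example1', 'Example2', 'Example3', 'Example4']]

# Later, two more (invented) men test and carry only some of the SNPs.
later = dict(early)
later["Tester C"] = {"Example1", "Example2", "Example4"}
later["Tester D"] = {"Example4"}
print(group_snps_by_carriers(later))
# [['Example1', 'Example2'], ['Example3'], ['Example4']]
```

Examples 1 and 2 stay together no matter how many men test, because they are two names for the same location. Example 3 and Example 4 separate only once someone tests who carries one without the other.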

SNP testing occurs in multiple ways, including:

  • NGS, next generation sequencing, tests such as the Big Y-700 which scans the gold standard region of the Y chromosome in order to find known SNPs at specific locations, mutations (variants) not yet named as SNPs, previously undiscovered variants and minimally 700 STR mutations.
  • WGS, whole genome sequencing, although there currently exist no bundled commercial tools to separate Y DNA information from the rest of the genome, nor any comparison methodology that allows whole genome information to be transferred to Family Tree DNA, the only commercial lab that does both testing and matching of NGS Y DNA tests and where most of the Y DNA tests reside. There can also be quality issues with whole genome sequencing if the genome is not scanned a similar number of times as the NGS Y tests. The criteria for what constitutes a “positive call” for a mutation at a specific location vary as well, with little standardization within the industry.
  • Targeted SNP testing of a specific SNP location. Available at Family Tree DNA  and other labs for some SNP locations, this test would only be done if you are looking for something very specific and know what you are doing. In some cases, a tester will purchase one SNP to verify that they are in a particular lineage, but there is no benefit such as matching. Furthermore, matching on one SNP alone does not confirm a specific lineage. Not all SNPs are individually available for purchase. In fact, as more SNPs are discovered at an astronomical rate, most aren’t available to purchase separately.
  • SNP panels which test a series of SNPs within a certain haplogroup in order to determine if a tester belongs to a specific subclade. These tests only test known SNPs and aren’t tests of discovery, scanning the useable portion of the Y chromosome. In other words, you will discern whether you are or are not a member of the specific subclades being tested for, but you will not learn anything more such as matching to a different subclade, or new, undiscovered variants (mutations) or subclades.

Subclade – A branch of a specific upstream branch of the haplotree.

Dictionary R.png

For example, in haplogroup R, R1 and R2 are subclades of haplogroup R. The graphic above conveys the concept of a subclade. Haplogroups beneath R1 and R2, respectively, are also subclades of haplogroup R as well as subclades of all clades above them on the haplotree.

Older naming conventions used letter number conventions such as R1 and R2 which expanded to R1b1c and so forth, alternating letters and numbers.

Today, we see most haplogroups designated by the haplogroup letter and SNP name. Using that notation methodology, R would be R-M207, R1 would be R-M173 and R2 would be R-M479.
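The two notations can be sketched side by side in a few lines of Python. The mapping holds only the three names mentioned above, and the prefix test in `is_subclade` is a simplification that assumes the older alphanumeric names. Both the dictionary name and the function are invented for this illustration.

```python
# Older alphanumeric names mapped to the SNP-based names given in the text.
SNP_NAMES = {
    "R": "R-M207",
    "R1": "R-M173",
    "R2": "R-M479",
}


def is_subclade(candidate: str, ancestor: str) -> bool:
    """With alphanumeric names, a subclade's name starts with its ancestor's name.

    Simplified on purpose: this is a plain prefix test on the old-style names.
    """
    return candidate != ancestor and candidate.startswith(ancestor)


print(SNP_NAMES["R1"])             # R-M173
print(is_subclade("R1", "R"))      # True
print(is_subclade("R1b1c", "R1"))  # True
print(is_subclade("R2", "R1"))     # False
```

The prefix trick only works because the ancestry is spelled out inside the old-style name. SNP-based names such as R-M173 carry no such information, so the relationship has to be looked up on the haplotree instead.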

Dictionary R branches.png

ISOGG documents Y haplogroup naming conventions and their history, maintaining both an alphanumeric and SNP tree for backwards compatibility. The alphanumeric tree was made obsolete because there was no way to split a haplogroup like R1b1c when a new branch appeared between R1b and R1b1 without renaming everything downstream of R1b, which caused constant reshuffling and renaming of tree branches. Haplogroup names were also growing to more than 20 characters long. Today, the terminal SNP is used as a person’s haplogroup designation. The SNP name never changes and the individual’s Y haplogroup only changes if:

  • Further testing is performed and the tester is discovered to have an additional mutation further downstream from their current terminal SNP
  • A SNP previously discovered using the Big Y NGS test has since been named because enough men were subsequently discovered to carry that mutation, and the newly named SNP is the tester’s terminal SNP

Terminal SNP – It’s really not fatal. Used in this context, “terminal” means end of line, meaning furthest down and closest to present in the haplotree.

Depending on what level of testing you’ve undergone, you may have different haplogroups, or SNPs, assigned as your official “end of line” haplogroup or “terminal SNP” at various times.

If you took any of the various STR panel tests (12, 25, 37, 67 or 111) at Family Tree DNA your SNP was predicted based on STR matches to other men. Let’s say that prediction is R-M198. At that time, R-M198 was your terminal SNP. If you took the Big Y-700 test, your terminal SNP would almost assuredly change to something much further downstream in the haplotree.

If you took an autosomal test, your haplogroup was predicted based on a panel of SNPs selected to be informative about Y or mitochondrial DNA haplogroups. As with predicted haplogroups from STR test panels, the only way to discover a more definitive haplogroup is with further testing.

If you took a Y DNA STR test, you can see by looking at your match list that other testers may have a variety of “terminal SNPs.”

Dictionary Y matches.png

In the above example, the tester was originally predicted as R-M198 but subsequently took a Big Y test. His haplogroup now is R-YP729, a subclade of R-M198 several branches downstream.

Looking at his Y DNA STR matches to view the haplogroups of his matches, we see that the Y DNA predicted or confirmed haplogroup is displayed in the Y-DNA Haplogroup column – and several other men are M198 as well.

Anyone who has taken any type of confirming SNP test, whether it’s an individual SNP test, a panel test or the Big Y has their confirmed haplogroup at that level of testing listed in the Terminal SNP column. What we don’t know and can’t tell is whether the men whose Terminal SNP is listed as R-M198 just tested that SNP or have undergone additional SNP testing downstream and tested negative for other downstream SNPs. We can tell if they have taken the Big Y test by looking at their tests taken, shown by the red arrows above.

If the haplogroup has been confirmed by any form of SNP testing, then the confirmed haplogroup is displayed under the column, “Terminal SNP.” Unfortunately, none of this tester’s matches at this STR marker level have taken the Big Y test. As expected, no one matches him on his Terminal SNP, meaning his SNP farthest down on the tree. To obtain that level of resolution, one would have to take the Big Y test and his matches have not.

Dictionary Y block tree.png

Looking at this tester’s Big Y Block Tree results, we can see that there are indeed 3 people that match him on his terminal SNP, but none of them match him on the STR tests which generally produce genealogical matches closer in time. This suggests that these haplogroup level matches are a result of an ancestor further back in time. Note that these men also have an average of 5 variants each that are currently unnamed. These may eventually be named and become baby branches.

SNP matches can be useful genealogically, depending on when they occurred, or can originate further back in time, perhaps before the advent of surnames.

Our tester’s paternal ancestors migrated from Germany to Hungary in the late 1700s or 1800s, settling in a region now in Croatia, but he’s brick-walled on his paternal line due to record loss during the various wars.

The block tree reveals that the tester’s Big Y SNP match is indeed from Germany, born in 1718, with other men carrying this same terminal SNP originating in both Hungary and Germany even though they aren’t shown as a STR marker match to our tester.

You can read more about the block tree in the article, Family Tree DNA’s New Big Y Block Tree.

Haplotype – your individual values for results of gene sequencing, such as SNPs or STR values tested in the 12, 25, 37, 67 and 111 marker panels at Family Tree DNA. The haplotype for the individual shown below would be 13 for location DYS393, 26 for location DYS390, 16 for location DYS19, and so forth.

Dictionary panel 1.png

The values in a haplotype tend to be inherited together, so they are “unique” to you and your family. In this case, the Y DNA STR values of 13, 26, 16 and 10 are generally inherited together (unless a new mutation occurs), passed from father to son on the Y chromosome. Therefore, this person’s haplotype is 13, 26, 16 and 10 for these 4 markers.
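A haplotype is easy to picture as a small lookup table of marker names and values. In the Python sketch below, the first three marker names and all four values come from the example above. The fourth marker name (`DYS391`) is my assumption, since only its value of 10 is given, and the second man's values are invented.

```python
# Marker values from the example; "DYS391" is an assumed name for the 4th marker.
haplotype = {"DYS393": 13, "DYS390": 26, "DYS19": 16, "DYS391": 10}

# An invented second tester who differs on one marker.
other_man = {"DYS393": 13, "DYS390": 25, "DYS19": 16, "DYS391": 10}


def differing_markers(a: dict, b: dict) -> list:
    """Return the markers, tested by both men, where the values are not the same."""
    shared = a.keys() & b.keys()
    return sorted(marker for marker in shared if a[marker] != b[marker])


print(differing_markers(haplotype, other_man))  # ['DYS390']
print(differing_markers(haplotype, haplotype))  # []
```

This only lists which markers differ. It is not a genetic distance calculation, which also depends on how far apart the values are.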

If this haplotype is rare, it may be very unique to the family. If the haplotype is common, it may only be unique to a much larger haplogroup reaching back hundreds or thousands of years. The larger the haplotype, the more unique it tends to be.

STR – Short tandem repeat. I think of a short tandem repeat as a copy machine or a stutter error. On the Y chromosome, the value of 13 at the location DYS393 above indicates that a series of DNA nucleotides is repeated a total of 13 times.

Indel example 1

Starting with the above example, let’s see how STR values accrue mutations.

STR example

In the example above, the CT sequence appears once and is then repeated 4 more times in this DNA sequence, for a total of 5 copies, so 5 would be the marker value.
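Counting a marker value like this can be sketched in a few lines of Python. The flanking letters in the sample sequences are invented, and `count_tandem_repeats` is a name I made up for this illustration. Real STR calling from sequencing reads is considerably more complex.

```python
import re


def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the length, in copies, of the longest back-to-back run of motif."""
    run = re.compile(f"(?:{re.escape(motif)})+")
    return max(
        (len(match.group(0)) // len(motif) for match in run.finditer(sequence)),
        default=0,
    )


# CT appears 5 times in a row, so the marker value is 5.
father = "AG" + "CT" * 5 + "GA"
print(count_tandem_repeats(father, "CT"))  # 5

# A "copy machine" error adds one more CT, and the marker value becomes 6.
son = "AG" + "CT" * 6 + "GA"
print(count_tandem_repeats(son, "CT"))  # 6
```

An STR mutation changes the count of repeats rather than swapping one nucleotide for another, which is why STR results are reported as numbers.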

Indel example 3

DNA can have deletions where the DNA at one or more locations is deleted and no DNA is found at that location, like the missing A above.

DNA can also have insertions where a particular value is inserted one or more times.

Dictionary insertion example.png

For example, if we know to expect the above values at DNA locations 1-10, and an insertion occurs between location 3 and 4, we know that insertion occurred because the alignment of the pattern of values expected in locations 4-10 is off by 1, and an unexpected T is found between 3 and 4, which I’ve labeled 3.1.
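The “off by 1” reasoning can be sketched in Python as well. The 10-letter expected sequence below is invented, as is the function name `find_single_insertion`. The sketch handles only the simplest case of exactly one extra nucleotide.

```python
def find_single_insertion(expected: str, observed: str):
    """Find one inserted nucleotide by testing which removal restores the pattern.

    Returns (n, base), where n is how many expected locations come before the
    insertion, or None when the difference is not a single insertion.
    """
    if len(observed) != len(expected) + 1:
        return None
    for i in range(len(observed)):
        # Drop the nucleotide at index i and see if the expected values line up.
        if observed[:i] + observed[i + 1:] == expected:
            return i, observed[i]
    return None


expected = "GACCAGGCTA"   # invented values for locations 1-10
observed = "GACTCAGGCTA"  # an extra T sits between locations 3 and 4

print(find_single_insertion(expected, observed))  # (3, 'T')
```

A result of `(3, 'T')` means three expected locations come before the extra T, which corresponds to the position labeled 3.1 in the text.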

Dictionary insertion example 1.png

STR, or copy, mutations are different from insertions, deletions or SNP mutations, shown below, where one SNP value is actually changed to another nucleotide.

Indel example 2

Haplotree – the SNP trees of humanity. Just a few years ago, we thought that there were only a few branches on the Y and mitochondrial trees of humanity, but the Big Y test has been a game changer for Y DNA.

At the end of 2019, the tree originating in Africa with Y chromosome Adam, whose descendants populated the earth, was composed of more than 217,277 variants divided into 24,838 individual Y haplotree branches.

A tree this size is very difficult to visualize, but you can take a look at Family Tree DNA’s public Y DNA tree here, beginning with haplogroup A. Today, there are 25,880 branches, an increase of more than 1,000 branches in less than 3 weeks since year end. This tree is growing at breakneck speed as more men take the Big Y-700 test and new SNPs are discovered.

On the Public Y Tree below, as you expand each haplogroup into subgroups, you’ll see the flags representing the locations of where the testers’ most distant paternal ancestor lived.

Dictionary public tree.png

I wrote about how to use the Y tree in the article Family Tree DNA’s PUBLIC Y DNA Haplotree.

The mitochondrial tree can be viewed here. I wrote about how to use the mitochondrial tree in the article Family Tree DNA’s Mitochondrial Haplotree.

Need Something Else?

I’ll be introducing more concepts and terms in future articles on the various Y DNA features. In the meantime, be sure to use the search box located in the upper right-hand corner of the blog to search for any term.

DNAexplain search box.png

For example, want to know what Genetic Distance means for either Y or mitochondrial DNA? Just type “genetic distance” into the search box, minus the quote marks, and press enter.

Enjoy and stay tuned for Part 3 in the Y DNA series, coming soon.


Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Big Y News and Stats + Sale

I must admit – this past January when FamilyTreeDNA announced the Big Y-700, an upgrade from the Big Y-500 product, I was skeptical. I wondered how much benefit testers would really see – but I was game to purchase a couple upgrades – and I did. Then, when the results came back, I purchased more!

I’m very pleased to announce that I’m no longer skeptical. I’m a believer.

The Big Y-700 has produced amazing results – and now FamilyTreeDNA has decoupled the price of the BAM file in addition to announcing substantial sale prices for their Thanksgiving Sale.

I’m going to discuss sale pricing for products other than the Big Y in a separate article because I’d like to focus on the progress that has been made on the phylogenetic tree (and in my own family history) as a result of the Big Y-700 this year.

Big Y Pricing Structure Change

FamilyTreeDNA recently announced some product structure changes.

The Big Y-700 price has been permanently dropped by $100 by decoupling the BAM file download from the price of the test itself. This accomplishes multiple things:

  • The majority of testers don’t want or need the BAM file, so the price of the test has been dropped by $100 permanently in order to be able to price the Big Y-700 more attractively to encourage more testers. That’s good for all of us!!!
  • For people who ordered the Big Y-700 since November 1, 2019 (when the sale prices began) who do want the BAM file, they can purchase the BAM file separately through the “Add Ons and Upgrades” page, via the “Upgrades” tab for $100 after their test results are returned. There will also be a link on the Big Y-700 results page. The total net price for those testers is exactly the same, but it represents a $100 permanent price drop for everyone else.
  • This BAM file decoupling reduces the initial cost of the Big Y-700 test itself, and everyone still has the option of purchasing the BAM file later, which will make the Big Y-700 test more affordable. Additionally, it allows the tester who wants the BAM file to divide the purchase into two pieces, which will help as well.
  • The current sale price for the Big Y-700 for the tester who has taken NO PREVIOUS Y DNA testing is now just $399, formerly $649. That’s an amazing price drop, about 40%, in the 9 months since the Big Y-700 was introduced!
  • Upgrade pricing is available too, further down in this article.
  • If you order an upgrade from any earlier Big Y to the Big Y-700, you receive an upgraded BAM file because you already paid for the BAM file when you ordered your initial Big Y test.
  • The VCF file is still available for download at no additional cost with any Big Y test.
  • There is no change in the BAM file availability for current customers. Everyone who ordered before November 1, 2019 will be able to download their BAM file as always.

The above changes are permanent, except for the sale price.
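For anyone who wants to check the “about 40%” figure, the arithmetic is short. This Python sketch uses the two prices quoted above. The function name is my own.

```python
def percent_drop(old_price: float, new_price: float) -> float:
    """Percentage reduction from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


former_price = 649  # Big Y-700 with no previous Y DNA test, former price
sale_price = 399    # current sale price for the same purchase

print(former_price - sale_price)                          # 250
print(round(percent_drop(former_price, sale_price), 1))   # 38.5
```

So the drop is $250, or roughly 38.5%, which rounds to the “about 40%” mentioned above.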

2019 has been a Banner Year

I know how successful the Big Y-700 has been for kits and projects that I manage, but how successful has it been overall, in a scientific sense?

I asked FamilyTreeDNA for some stats about the number of SNPs discovered and the number of branches added to the Y phylotree.

Drum roll please…

Year | Branches Added This Year | Total Tree Branches | Variants Added to Tree This Year | Total Variants Added to Tree
2018 | 6,259 | 17,958 | 60,468 | 132,634
2019 | 4,394 | 22,352 | 32,193 | 164,827

The 2019 figures cover only 10 months, through October, and not the entire year.
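The figures above hang together, which a quick Python check confirms: each 2019 total should equal the 2018 total plus what was added in 2019. The dictionary layout and function name are my own. The per-month averages are a simple division on my part, not a statistic supplied by FamilyTreeDNA.

```python
stats = {
    2018: {"branches_added": 6259, "total_branches": 17958,
           "variants_added": 60468, "total_variants": 132634},
    2019: {"branches_added": 4394, "total_branches": 22352,
           "variants_added": 32193, "total_variants": 164827},
}


def totals_are_consistent(kind: str) -> bool:
    """Check that the 2018 total plus the 2019 additions equals the 2019 total."""
    return (stats[2018][f"total_{kind}"] + stats[2019][f"{kind}_added"]
            == stats[2019][f"total_{kind}"])


print(totals_are_consistent("branches"))  # True
print(totals_are_consistent("variants"))  # True

# Rough monthly pace: 2018 covers 12 months, 2019 only 10 (through October).
print(round(stats[2018]["branches_added"] / 12, 1))  # 521.6
print(round(stats[2019]["branches_added"] / 10, 1))  # 439.4
```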

Haplotree Branches

Not every SNP discovered results in a new branch being added to the haplotree, but many do. This chart shows the number of actual branches added in 2018 and 2019 to date.

Big Y 700 haplotree branches.png

These stats, provided by FamilyTreeDNA, show the totals in the bottom row, which is a cumulative branch number total, not a monthly total. At the end of October 2019, the total number of individual branches was 22,352.

Big Y 700 haplotree branches small.png

This chart, above, shows some of the smaller haplogroups.

Big Y 700 haplotree branches large.png

This chart shows the larger haplogroups, including massive haplogroup R.

Haplotree Variants

The number of variants listed below is the number of SNPs that have been discovered, named and placed on the tree. You’ll notice that these numbers are a lot larger than the number of branches, above. That’s because roughly 168,000 of these are equivalent SNPs, meaning they don’t further branch the tree – at least not yet. These 168K variants are the candidates to be new branches as more people test and the tree can be further split.

Big Y 700 variants.png

These numbers also don’t include Private Variants, meaning SNPs that have not yet been named.

If you see Private Variants listed in your Big Y results, when enough people have tested positive for the same variant, and it makes sense, the variants will be given a SNP name and placed on the tree.

Big Y 700 variants small.png

The variants for the smaller haplogroups are shown again above, followed by those for the larger haplogroups below.

Big Y 700 variants large.png

Upgrades from the Big Y, or Big Y-500 to Big Y-700

Based on what I see in projects, roughly one third of the Big Y and Big Y-500 tests have upgraded to the Big Y-700.

For my Estes line, I wondered how much value the Big Y-700 upgrade would convey, if any, but I’m extremely glad I upgraded several kits. As a result of the Big Y-700, we’ve further divided the sons of Abraham, born in 1747. This granularity wasn’t accomplished by STR testing and wasn’t accomplished by the Big Y or Big Y-500 testing alone – although all of these together are building blocks. I’m ECSTATIC since it’s my own ancestral line that has the new lineage defining SNP.

Big Y 700 Estes.png

Every Estes man descended from Robert born in 1555 has R-BY482.

The sons of the immigrant, Abraham, through his father, Silvester, all have BY490, but the descendants of Silvester’s brother, Robert, do not.

Moses, son of Abraham, has ZS3700, but the rest of Abraham’s sons don’t.

Then, someplace in the line of kit 831469, between Moses born in 1711 and the present-day tester, we find a new SNP, BY154784.

Big Y 700 Estes block tree.png

Looking at the block tree, we see the various SNPs that are entirely Estes, except for one gentleman who does not carry the Estes surname. I wrote about the Block Tree, here.

Without Big Y testing, none of these SNPs would have been found, meaning we could never have split these lines genealogically.

Every kit I’ve reviewed carries SNPs that the Big Y-700 has been able to discern that weren’t discovered previously.

Every. Single. One.

Now, even someone who hasn’t tested Y DNA before can get the whole enchilada – meaning 700+ STRs, testing for all previously discovered SNPs, and new branch defining SNPs, like my Estes men – for $399.

If a new Estes tester takes this test, without knowing anything about his genealogy, I can tell him a great deal about where to look for his lineage in the Estes tree.

Reduced Prices

FamilyTreeDNA has made purchasing the Big Y-700 outright, or upgrading, EXTREMELY attractive.

Test | Price
Big Y-700 purchase with no previous Y DNA test | $399
Y-12 upgrade to Big Y-700 | $359
Y-25 upgrade to Big Y-700 | $349
Y-37 upgrade to Big Y-700 | $319
Y-67 upgrade to Big Y-700 | $259
Y-111 upgrade to Big Y-700 | $229
Big Y or Big Y-500 upgrade to Big Y-700 | $189

Note that the upgrades include all of the STR markers as yet untested. For example, the 12-marker to Big Y-700 includes all of the STRs between 25 and 111, in addition to the Big Y-700 itself. The Big Y-700 includes:

  • All of the already discovered SNPs, called Named Variants, extending your haplogroup all the way to the leaf at the end of your branch
  • Personal and previously undiscovered SNPs called Private Variants
  • All of the untested STR markers inclusive through 111 markers
  • A minimum of a total of 700 STR markers, including markers above 111 that are only available through Big Y-700 testing

With the refinements in the Big Y test over the past few years, and months, the Big Y is increasingly important to genealogy, as much as or more than traditional STR testing. This is in part because SNPs are not prone to back mutations and are therefore more stable than STR markers. Taken together, STRs and SNPs are extremely informative, helping to break down ancestral brick walls for people whose genealogy may not reach far back in time, and even for those whose genealogy does.

If you are a male and have not Y DNA tested, there’s never been a better opportunity. If you are a female, find a male on a brick wall line and sponsor a scholarship.

Click here to order or upgrade!


2018 – The Year of the Segment

Looking in the rear view mirror, what a year! Some days it’s been hard to catch your breath because things have been moving so fast.

What were the major happenings, how did they affect genetic genealogy and what’s coming in 2019?

The SNiPPY Award

First of all, I’m giving an award this year. The SNiPPY.

Yeah, I know it’s kinda hokey, but it’s my way of saying a huge thank you to someone in this field who has made a remarkable contribution and deserves special recognition.

Who will it be this year?

Drum roll…….

The 2018 SNiPPY goes to…

DNAPainter – The 2018 SNiPPY award goes to DNAPainter, without question. Applause, everyone, applause! And congratulations to Jonny Perl, pictured below at RootsTech!

Jonny Perl created this wonderful, visual tool that allows you to paint your matches with people on your chromosomes, assigning the match to specific ancestors.

I’ve written about how to use the tool with different vendors’ results and have discovered many different ways to utilize the painted segments. The DNA Painter User Group is here on Facebook. I use DNAPainter EVERY SINGLE DAY to solve a wide variety of challenges.

What else has happened this year? A lot!

Ancient DNA – Academic research seldom reports on Y and mitochondrial DNA today and is firmly focused on sequencing ancient DNA. Ancient genome sequencing has only recently been developed to a state where at least some remains can be successfully sequenced, but it’s going great guns now. Take a look at Jennifer Raff’s article in Forbes that discusses ancient DNA findings in the Americas, Europe, Southeast Asia and perhaps most surprising, a first generation descendant of a Neanderthal and a Denisovan.

From Early human dispersals within the Americas by Moreno-Mayar et al, Science 07 Dec 2018

Inroads were made into deeper understanding of human migration in the Americas as well in the paper Early human dispersals within the Americas by Moreno-Mayar et al.

I look for 2019 and on into the future to hold many more revelations thanks to ancient DNA sequencing as well as using those sequences to assist in understanding the migration patterns of ancient people that eventually became us.

Barbara Rae-Venter and the Golden State Killer Case

Using techniques that adoptees use to identify their close relatives and eventually, their parents, Barbara Rae-Venter assisted law enforcement with identifying the man, Joseph DeAngelo, accused (not yet convicted) of being the Golden State Killer (GSK).

A very large congratulations to Barbara, a retired patent attorney who is also a genealogist. Nature recognized Ms. Rae-Venter as one of 2018’s 10 People Who Mattered in Science.

DNA in the News

DNA is also represented on the 2018 Nature list by Viviane Slon, a palaeogeneticist who discovered an ancient half Neanderthal, half Denisovan individual and sequenced their DNA, and by He Jiankui, a Chinese scientist who claims to have created a gene-edited baby, which has sparked widespread controversy. As of the end of the year, He Jiankui’s research activities have been suspended and he is reportedly sequestered in his apartment, under guard, although the details are far from clear.

In 2013, 23andMe patented the technology for designer babies and I removed my kit from their research program. I was concerned at the time that this technology knife could cut two ways, both for good, eliminating fatal disease-causing mutations and also for ethically questionable practices, such as eugenics. I was told at the time that my fears were unfounded, because that “couldn’t be done.” Well, 5 years later, here we are. I expect the debate about the ethics and eventual regulation of gene-editing will rage globally for years to come.

Elizabeth Warren’s DNA was also in the news when she took a DNA test in response to political challenges. I wrote about what those results meant scientifically, here. This topic became highly volatile and politicized, with everyone seeming to have a very strongly held opinion. Regardless of where you fall on that opinion spectrum (and no, please do not post political comments as they will not be approved), the topic is likely to surface again in 2019 due to the fact that Elizabeth Warren has just today announced her intention to run for President. The good news is that DNA testing will likely be discussed, sparking curiosity in some people, perhaps encouraging them to test. The bad news is that some of the discussion may be unpleasant at best, and incorrect click-bait at worst. We’ve already had a rather unpleasant sampling of this.

Law Enforcement and Genetic Genealogy

The Golden State Killer case sparked widespread controversy about using GedMatch and potentially other genetic genealogy databases to assist in catching people who have committed violent crimes, such as rape and murder.

GedMatch, the database used for the GSK case, has made it very clear in their terms and conditions that DNA matches may be used both by adoptees seeking their families and for other uses, such as law enforcement seeking matches to DNA sequenced during a criminal investigation. Since April 2018, more than 15 cold case investigations have been solved using the same technique and results at GedMatch. Initially some people removed their DNA from GedMatch, but it appears that the overwhelming sentiment, based on uploads, is that people either aren’t concerned or welcome the opportunity for their DNA matches to assist in apprehending criminals.

Parabon Nanolabs in May established a genetic genealogy division headed by CeCe Moore who has worked in the adoptee community for the past several years. The division specializes in DNA testing forensic samples and then assisting law enforcement with the associated genetic genealogy.

Currently, GedMatch is the only vendor supporting the use of forensic sample matching. Neither 23andMe nor Ancestry allows uploaded data, and MyHeritage and Family Tree DNA’s terms of service currently preclude this type of use.

MyHeritage

Wow, talk about coming onto the DNA world stage with a boom.

MyHeritage went from a somewhat wobbly DNA start about 2 years ago to rolling out a chromosome browser at the end of January and adding important features such as SmartMatching, which matches your DNA and your family trees. Add triangulation to this mixture, along with record matching, and you’ve got a #1 winning combination.

It was Gilad Japhet, the MyHeritage CEO, who at Rootstech christened 2018 “The Year of the Segment,” and I do believe he was right. Additionally, he announced that MyHeritage partnered with the adoption community by offering 15,000 free kits to adoptees.

In November, MyHeritage hosted MyHeritage LIVE, their first user conference, in Oslo, Norway, which focused on both their genealogical records offerings and DNA. This was a resounding success and I hope MyHeritage will continue to sponsor conferences and invest in DNA. You can test your DNA at MyHeritage or upload your results from other vendors (instructions here). You can follow my journey and the conference in Oslo here, here, here, here and here.

GDPR

GDPR caused a lot of misery, and I’m glad the implementation is behind us, but the ripples will be affecting everyone for years to come.

GDPR, the European Union’s General Data Protection Regulation, which went into effect on May 25, 2018, has been a mixed and confusing bag for genetic genealogy. I think the concept of users being in charge and understanding what is happening with their data, and in this case, their data plus their DNA, is absolutely sound. The requirements, however, were created without any consideration for this industry – which is small by comparison to the Googles and Facebooks of the world. However, the Googles and Facebooks of the world along with many larger vendors seem to have skated, at least somewhat.

Other companies shut their doors or restricted their offerings in other ways, such as World Families Network and Oxford Ancestors. Vendors such as Ancestry and Family Tree DNA had to make unpopular changes in how their users interface with their software – in essence making genetic genealogy more difficult without any corresponding positive return. The potential fines, 20 million plus Euro for any company holding data for EU residents made it unwise to ignore the mandates.

In the genetic genealogy space, the shuttering of both YSearch and MitoSearch was heartbreaking, because that was the only location where you could actually compare Y STR and mitochondrial HVR1/2 results. Not everyone uploaded their results, and the sites had not been updated in a number of years, but the closure due to GDPR was still a community loss.

Today, mitoydna.org, a nonprofit comprised of genetic genealogists, is making strides in replacing that lost functionality, plus, hopefully more.

On to more positive events.

Family Tree DNA

In April, Family Tree DNA announced a new version of the Big Y test, the Big Y-500 in which at least 389 additional STR markers are included with the Big Y test, for free. If you’re lucky, you’ll receive between 389 and 439 new markers, depending on how many STR markers above 111 have quality reads. All customers are guaranteed a minimum of 500 STR markers in total. Matching was implemented in December.
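For readers who like to see the arithmetic spelled out, here is a minimal sketch in Python of where the 389 and 439 figures come from. The constant names are my own, chosen only for illustration. The numbers are the ones quoted above.

```python
# Illustrative constants: the names are hypothetical, the numbers come from the text.
STANDARD_STR_MARKERS = 111   # markers covered by the existing 12-111 panels
GUARANTEED_TOTAL = 500       # minimum total STR markers promised to customers
MAX_NEW_MARKERS = 439        # best case quoted for the new markers

# The fewest new markers you can receive is whatever tops 111 up to 500.
min_new = GUARANTEED_TOTAL - STANDARD_STR_MARKERS

# The most markers you can end up with is 111 plus the best case.
max_total = STANDARD_STR_MARKERS + MAX_NEW_MARKERS

print(f"New markers: {min_new} to {MAX_NEW_MARKERS}")
print(f"Total STR markers: {GUARANTEED_TOTAL} to {max_total}")
```

In other words, the guaranteed 500 is simply the 111 standard markers plus the 389 minimum, and a sample where every additional marker reads well tops out at 550.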

These additional STR markers allow genealogists to assemble additional line marker mutations to more granularly identify specific male lineages. In other words, maybe I can finally figure out a line marker mutation that will differentiate my ancestor’s line from other sons of my founding ancestor😊

In June, Family Tree DNA announced that they had named more than 100,000 SNPs which means many haplogroup additions to the Y tree. Then, in September, Family Tree DNA published their Y haplotree, with locations, publicly for all to reference.

I was very pleased to see this development, because Family Tree DNA clearly has the largest Y database in the industry, by far, and now everyone can reap the benefits.

In October, Family Tree DNA published their mitochondrial tree publicly as well, with corresponding haplogroup locations. It’s nice that Family Tree DNA continues to be the science company.

You can test your Y DNA, mitochondrial or autosomal (Family Finder) at Family Tree DNA. They are the only vendor offering full Y and mitochondrial services complete with matching.

2018 Conferences

Of course, there are always the national conferences we’re familiar with, but more and more, online conferences are becoming available, as well as some sessions from the more traditional conferences.

I attended Rootstech in Salt Lake City in February (brrrr), which was lots of fun because I got to meet and visit with so many people including Mags Gaulden, above, who is a WikiTree volunteer and writes at Grandma’s Genes, but as a relatively expensive conference to attend, Rootstech was pretty miserable. Rootstech has reportedly made changes and I hope it’s much better for attendees in 2019. My attendance is very doubtful, although I vacillate back and forth.

On the other hand, the MyHeritage LIVE conference was amazing with both livestreamed and recorded sessions which are now available free here along with many others at Legacy Family Tree Webinars.

Family Tree University held a Virtual DNA Conference in June and those sessions, along with others, are available for subscribers to view.

The Virtual Genealogical Association was formed for those who find it difficult or impossible to participate in local associations. They too are focused on education via webinars.

Genetic Genealogy Ireland continues to provide their yearly conference sessions both livestreamed and recorded for free. These aren’t just for people with Irish genealogy. Everyone can benefit and I enjoy them immensely.

Bottom line, you can sit at home and educate yourself now. Technology is wonderful!

2019 Conferences

In 2019, I’ll be speaking at the National Genealogical Society Family History Conference, Journey of Discovery, in St. Charles, providing the Special Thursday Session titled “DNA: King Arthur’s Mighty Genetic Lightsaber” about how to use DNA to break through brick walls. I’ll also see attendees at Saturday lunch when I’ll be providing a fun session titled “Twists and Turns in the Genetic Road.” This is going to be a great conference with a wonderful lineup of speakers. Hope to see you there.

There may be more speaking engagements at conferences on my 2019 schedule, so stay tuned!

The Leeds Method

In September, Dana Leeds publicized The Leeds Method, another way of grouping your matches that clusters matches in a way that indicates your four grandparents.

I combine the Leeds method with DNAPainter. Great job Dana!

Genetic Affairs

In December, Genetic Affairs introduced an inexpensive subscription reporting and visual clustering methodology, but you can try it for free.

I love this grouping tool. I have already found connections I didn’t know existed previously. I suggest joining the Genetic Affairs User Group on Facebook.

DNAGedcom.com

I wrote an article in January about how to use the DNAGedcom.com client to download the trees of all of your matches and sort to find specific surnames or locations of their ancestors.

However, in December, DNAGedcom.com added another feature with their new DNAGedcom client just released that downloads your match information from all vendors, compiles it and then forms clusters. They have worked with Dana Leeds on this, so it’s a combination of the various methodologies discussed above. I have not worked with the new tool yet, as it has just been released, but Kitty Cooper has and writes about it here.  If you are interested in this approach, I would suggest joining the Facebook DNAGedcom User Group.

Rootsfinder

I have not had a chance to work with Rootsfinder beyond the very basics, but Rootsfinder provides genetic network displays for people that you match, as well as triangulated views. Genetic network visualizations are great ways to discern patterns. The tool creates match or triangulation groups automatically for you.

Training videos are available at the website and you can join the Rootsfinder DNA Tools group at Facebook.

Chips and Imputation

Illumina, the chip maker that provides the DNA chips that most vendors use to test, changed from the OmniExpress to the GSA chip during the past year. Older chips have been available, but won’t be forever.

The newer GSA chip is only partially compatible with the OmniExpress chip, providing limited overlap between the older and the new results. This has forced the vendors to use imputation to equalize the playing field between the chips, so to speak.

This has also caused a significant hardship for GedMatch who is now in the position of trying to match reasonably between many different chips that sometimes overlap minimally. GedMatch introduced Genesis as a sandbox beta version previously, but are now in the process of combining regular GedMatch and Genesis into one. Yes, there are problems and matching challenges. Patience is the key word as the various vendors and GedMatch adapt and improve their required migration to imputation.

DNA Central

In June Blaine Bettinger announced DNACentral, an online monthly or yearly subscription site as well as a monthly newsletter that covers news in the genetic genealogy industry.

Many educators in the industry have created seminars for DNACentral. I just finished recording “Getting the Most out of Y DNA” for Blaine.

Even though I work in this industry, I still subscribed – initially to show support for Blaine, thinking I might not get much out of the newsletter. I’m pleased to say that I was wrong. I enjoy the newsletter and will be watching sessions in the Course Library and the Monthly Webinars soon.

If you or someone you know is looking for “how to” videos for each vendor, DNACentral offers “Now What” courses for Ancestry, MyHeritage, 23andMe, Family Tree DNA and Living DNA in addition to topic specific sessions like the X chromosome, for example.

Social Media

2018 has seen a huge jump in social media usage, which is both bad and good. The good news is that many new people are engaged. The bad news is that people are often given faulty advice, and for new people it’s very difficult (nigh on impossible) to tell who is credible and who isn’t. I created a Help page for just this reason.

You can help with this issue by recommending subscribing to these three blogs, not just reading an article, to newbies or people seeking answers.

Always feel free to post links to my articles on any social media platform. Share, retweet, whatever it takes to get the words out!

The general genetic genealogy social media group I would recommend if I were to select only one would be Genetic Genealogy Tips and Techniques. It’s quite large but well-managed and remains positive.

I’m a member of many additional groups, several of which are vendor or interest specific.

Genetic Snakeoil

Now the bad news. Everyone has noticed the popularity of DNA testing – including shady characters.

Be careful, very VERY careful who you purchase products from and where you upload your DNA data.

If something is free, and you’re not within a well-known community, then YOU ARE THE PRODUCT. If it sounds too good to be true, it probably is. If it sounds shady or questionable, it’s probably that and more, or less.

If reputable people and vendors tell you that no, they really can’t determine your Native American tribe, for example, no other vendor can either. Just yesterday, a cousin sent me a link to a “tribe” in Canada that will, “for $50, we find one of your aboriginal ancestors and the nation stamps it.” On their list of aboriginal people we find one of my ancestors who, based on mitochondrial DNA tests, is clearly NOT aboriginal. Snake oil comes in lots of flavors with snake oil salesmen looking to prey on other people’s desires.

When considering DNA testing or transfers, make sure you fully understand the terms and conditions, where your DNA is going, who is doing what with it, and your recourse. Yes, read every single word of those terms and conditions. For more about legalities, check out Judy Russell’s blog.

Recommended Vendors

All those DNA tests look yummy-good, but in terms of vendors, I heartily recommend staying within the known credible vendors, as follows (in alphabetical order).

For genetic genealogy for ethnicity AND matching:

  • 23andMe
  • Ancestry
  • Family Tree DNA
  • GedMatch (not a vendor because they don’t test DNA, but a reputable third party)
  • MyHeritage

You can read about Which DNA Test is Best here although I need to update this article to reflect the 2018 additions by MyHeritage.

Understand that both 23andMe and Ancestry will sell your DNA if you consent and if you consent, you will not know who is using your DNA, where, or for what purposes. Neither Family Tree DNA, GedMatch, MyHeritage, Genographic Project, Insitome, Promethease nor LivingDNA sell your DNA.

The next group of vendors offers ethnicity without matching:

  • Genographic Project by National Geographic Society
  • Insitome
  • LivingDNA (currently working on matching, but not released yet)

Health (as a consumer, meaning you receive the results)

Medical (as a contributor, meaning you are contributing your DNA for research)

  • 23andMe
  • Ancestry
  • DNA.Land (not a testing vendor, doesn’t test DNA)

There are a few other niche vendors known for specific things within the genetic genealogy community, many of whom are mentioned in this article, but other than known vendors, buyer beware. If you don’t see them listed or discussed on my blog, there’s probably a reason.

What’s Coming in 2019

Just like we couldn’t have foreseen much of what happened in 2018, we don’t have access to a 2019 crystal ball, but it looks like 2019 is taking off like a rocket. We do know about a few things to look for:

  • MyHeritage is waiting to see if envelope and stamp DNA extractions are successful so that they can be added to their database.
  • www.totheletterDNA.com is extracting (attempting to) and processing DNA from stamps and envelopes for several people in the community. Hopefully they will be successful.
  • LivingDNA has been working on matching since before I met with their representative in October of 2017 in Dublin. They are now in Beta testing for a few individuals, but they have also just changed their DNA processing chip – so how that will affect things and how soon they will have matching ready to roll out the door is unknown.
  • Ancestry did a 2018 ethnicity update, integrating ethnicity more tightly with Genetic Communities, offered genetic traits and made some minor improvements this year, along with adding one questionable feature – showing your matches the location where you live as recorded in your profile. (23andMe subsequently added the same feature.) Ancestry recently said that they are promising exciting new tools for 2019, but somehow I doubt that the chromosome browser that’s been on my Christmas list for years will be forthcoming. Fingers crossed for something new and really useful. In the meantime, we can download our DNA results and upload to MyHeritage, Family Tree DNA and GedMatch for segment matching, as well as utilize Ancestry’s internal matching tools. DNA+tree matching, those green leaf shared ancestor hints, is still their strongest feature.
  • The Family Tree DNA Conference for Project Administrators will be held March 22-24 in Houston this year, and I’m hopeful that they will have new tools and announcements at that event. I’m looking forward to seeing many old friends in Houston in March.

Here’s what I know for sure about 2019 – it’s going to be an amazing year. We as a community and also as individual genealogists will be making incredible discoveries and moving the ball forward. I can hardly wait to see what quandaries I’ve solved a year from now.

What mysteries do you want to unravel?

I’d like to offer a big thank you to everyone who made 2018 wonderful and a big toast to finding lots of new ancestors and breaking down those brick walls in 2019.

Happy New Year!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some (but not all) of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Should I Upgrade My Y DNA Test?

I’m often asked about the benefits of upgrading Y DNA tests at Family Tree DNA, and if people should order an upgrade.

The answer to this, like just about everything else DNA is “it depends.”

Yes – Upgrade!

The answer IS YES if:

  • You have tested fewer than 37 markers. You really need 37 or 67 markers minimally today for genealogy.
  • You want to obtain all of the information possible about your ancestral lineage and where it came from. (That’s me!)
  • You want to participate in family as well as scientific research by upgrading to the Big Y. Why the Big Y? I wrote about that here.
  • You want the most refined haplogroup possible in order to see who you match the most closely that might not be a match on the STR (12-111) panels. This is particularly useful in terms of looking for clan overlap and relatedness further back in time in Scotland, for example.
  • You have lots of matches at your current level and you wish to eliminate the ones that aren’t relevant.
  • You have (your own) surname matches at levels higher than you’ve tested and you want to further determine which matches are closer genealogically.
  • You have no matches at your current level. Sometimes you pick up matches at higher levels because they allow more mutations and your mutations (or their mutations) may simply fall in the lower panels.
  • You want to leave a legacy for future genealogists by providing as much information as possible. This is especially important if you are the last of your line, or males with the surname from your family line are in short supply.

No – Maybe Not Now

The answer IS NO if none of the above applies and:

  • You’ve already tested to 37 markers, don’t have matches at lower levels, and you don’t care.
  • You’ve tested to 37 markers, don’t have matches and have to choose between a Y upgrade and a different kind of test, like autosomal or mitochondrial that you haven’t yet taken. You’ll probably learn more by testing an untapped resource.
  • You’ve tested to 37 markers and have to choose between a Y upgrade and a new test for a relative that will provide information about one of your paternal ancestral lines that hasn’t been tested. Hint, look at the surname project in question to be sure your lines aren’t already present.

Surname Project Search

You can search for the surname and projects on the main Family Tree DNA page by scrolling down until you see the surname search box.

Of course, if your ancestor is represented in a public surname project, and you have someone available to test, it’s always a good idea to test that person…well…because you never know if there was an adoption or some hanky panky – or your genealogy is wrong. Better to find out now than to go on blissfully doing genealogy on the wrong line.

Summer Sale is in Full Swing

The great news is that the Family Tree DNA Summer Sale is in full swing, and unlike last year’s sale, upgrades ARE included.

Plus, as an added bonus, when you upgrade to the Big Y-500 test, the markers between where you’ve already tested and the Big Y-500 are included in the price. So if you’ve tested to 37 markers and order the Big Y 500, you receive:

  • 67 marker upgrade
  • 111 marker upgrade
  • Big Y test
  • Additional markers to total 500 above the 111 marker panel – that’s 389 extra markers for free with the Big Y

In essence, this upgrade is 4 tests bundled into one and it’s on sale for less than the Big Y itself used to cost on sale, a year ago, at about $500. This has never been a better value than it is now.

Upgrade prices are shown above and you can order by clicking here and signing on to your account. Then, just click on the blue upgrade button by your Y DNA results.

Need to order a new test, not an upgrade? Great! Click here.

Why Different Haplogroup Results?

“Why do vendors give me different haplogroups?”

This question often comes up when people test with different vendors and receive different haplogroup results for Y DNA, mitochondrial DNA, or both.

If you need a quick refresher on who carries which types of DNA, read 4 Kinds of DNA for Genetic Genealogy.

You’re the same person, right, so why would you receive different answers from different testing companies, and which answer is actually right?

The answer is pretty straightforward, conceptually – having to do with how vendors test and interpret your DNA.

Different companies test different pieces of your DNA, depending on:

  • The type of chip the company is using for testing
  • The way they have programmed the chip
  • The version of the reference “tree” they are using to assign haplogroups
  • The level they have decided to report

Therefore, their haplogroups reported may vary, and some may be more exact than others. Occasionally, a vendor outside the major testers is simply wrong.

Not All Tests are Created Equal

All haplogroups carry interesting information and can be at least somewhat genealogically useful. For example, haplogroups alone can tell you if your direct line DNA (paternal or matrilineal) is probably European, Asian, African or Native American. Note the word probably. This too may be subject to interpretation.

A basic haplogroup can rule out a genealogical match through a specific branch, but can’t confirm a genealogical match. You need to compare specific DNA locations not provided with haplogroup testing alone for genealogical matching. Plus you’ll need to add genealogical records where possible.

Let’s look at two examples.

Mitochondrial DNA

Your mitochondrial DNA is inherited from your mother’s direct line, on up your tree until you run out of mothers. So, you, your mother, her mother, her mother…etc.

The red circles show the mitochondrial lineage in the pedigree chart, below.

If your mitochondrial haplogroup is H1a, for example, then your base haplogroup is “H”, the first branch is “1” and the next smaller branch is “a.”

Therefore, if you don’t match at H, your base haplogroup, you aren’t a possible match on that genealogical line. In other words, if you are H1a, or H plus anything, you can’t match on the direct matrilineal line of someone who is J1a, or J plus anything. H and J are different base haplogroups who haven’t shared a common ancestor in tens of thousands of years.

You can, however, potentially be related on any other line – just not on this specific line.

If your haplogroup does match, even exactly, that doesn’t mean you are related in a genealogically relevant timeframe. It means you share an ancestor, but that common ancestor may be back hundreds, thousands or even tens of thousands of years.

The further downstream, the younger the branches.  “H” is the oldest, then “1,” then “a” is the youngest.
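The rule-out logic described above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: it treats the first letter of the haplogroup name as the base haplogroup, which is how the H and J examples here work, but real mitochondrial naming has wrinkles this ignores. The function names are my own.

```python
def base_haplogroup(haplogroup: str) -> str:
    """Return the base haplogroup, assumed here to be the first letter."""
    name = haplogroup.strip()
    if not name:
        raise ValueError("haplogroup name is empty")
    return name[0].upper()


def can_rule_out_match(haplogroup_a: str, haplogroup_b: str) -> bool:
    """True when two people cannot match on the direct matrilineal line.

    A different base haplogroup rules the match out. The same base
    haplogroup does NOT confirm a match; it only means it isn't excluded.
    """
    return base_haplogroup(haplogroup_a) != base_haplogroup(haplogroup_b)


print(can_rule_out_match("H1a", "J1a"))   # True: H and J are different bases
print(can_rule_out_match("H1a", "H1a"))   # False: not ruled out, but not confirmed
```

Notice that the function can only ever say "ruled out" or "not ruled out." Confirming a genealogical match still takes full matching plus genealogical records.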

Some companies might just test the locations for H, some for H1 and some for H1a.  Of course, there are even more haplogroups, like H1a2a. New, more refined haplogroups are discovered with each new version of the mitochondrial reference tree.

The only company that tests your haplogroup all the way to the end, meaning the most refined test possible to give you your complete haplogroup and all mutations, is Family Tree DNA with their mtFull Sequence test.

A quick comparison of my mitochondrial DNA at the following three vendors shows the following:

23andMe | Living DNA | Family Tree DNA Full Sequence
J1c2 | J1c | J1c2f

With Family Tree DNA’s full sequence test, you’ll receive your full haplogroup along with matching to other people who have taken mitochondrial DNA tests. They are the only vendor to offer Y and mitochondrial matching, because they are the only vendor that tests at that level.

Y DNA

Y DNA operates on the same principle. Specific locations called SNPs are tested by companies like 23andMe and Living DNA to provide customers with a branch level haplogroup. You don’t receive matching with these types of tests.

Just like with mitochondrial DNA, a basic branch level test can eliminate a match on the direct paternal (surname) branch but can’t confirm the genealogical match.

If your haplogroup branch is E-M2 and someone else’s is R-M269, you can’t share a common paternal ancestor because your base haplogroups don’t match, meaning E and R.

You can share an ancestor on any other line, just not on the direct Y line.

The blue squares show the Y DNA lineage on the pedigree chart below.

Family Tree DNA predicts your haplogroup for free if you take the 37, 67 or 111 marker Y-DNA STR test, but if you take the Big Y-500, your Y chromosome is completely tested and your haplogroup defined to the most refined level possible (often called your terminal SNP) – including mutations that may exist in only very few people. You also receive matching to other testers (with any Y test) which can be very genealogically relevant, plus bonus Y STR markers with the Y-500.

OK, But Why Do Different Companies Give Me Different Haplogroup Results?

Great question.

For this example, let’s say your haplogroup is H1a2a.

Let’s say that Company 1 uses a chip that they’ve programmed to test to the H1a level of haplogroup H1a2a.

Let’s say that Company 2 uses a chip that they’ve programmed to test to the H1 level of haplogroup H1a2a.

Let’s say that you take the full sequence test with Family Tree DNA and they fully test all 16,569 locations of your mitochondria and determine that you are H1a2a.

Company 1 will report your mitochondrial haplogroup as H1a, Company 2 as H1 and Family Tree DNA as H1a2a.
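Here is a minimal Python sketch of that scenario. It assumes mitochondrial haplogroup names alternate between letters and numbers, so that H1a2a breaks down into H, H1, H1a, H1a2 and H1a2a. The function names and the idea of a vendor testing a fixed number of levels are my own simplifications for illustration, not how any vendor actually programs a chip.

```python
import re


def branch_path(haplogroup: str) -> list[str]:
    """Split a name like 'H1a2a' into its nested branches, oldest first."""
    tokens = re.findall(r"[A-Za-z]+|\d+", haplogroup)
    path, name = [], ""
    for token in tokens:
        name += token
        path.append(name)
    return path


def reported_haplogroup(full_haplogroup: str, levels_tested: int) -> str:
    """Return what a vendor reports if it only tests this many levels deep."""
    if levels_tested < 1:
        raise ValueError("a vendor must test at least the base haplogroup")
    path = branch_path(full_haplogroup)
    return path[min(levels_tested, len(path)) - 1]


def same_branch(haplogroup_a: str, haplogroup_b: str) -> bool:
    """True when one result is just a less refined version of the other."""
    path_a, path_b = branch_path(haplogroup_a), branch_path(haplogroup_b)
    shared = min(len(path_a), len(path_b))
    return path_a[:shared] == path_b[:shared]


print(branch_path("H1a2a"))             # ['H', 'H1', 'H1a', 'H1a2', 'H1a2a']
print(reported_haplogroup("H1a2a", 3))  # H1a  (Company 1)
print(reported_haplogroup("H1a2a", 2))  # H1   (Company 2)
print(same_branch("J1c", "J1c2f"))      # True
```

The last line uses my own results from the comparison earlier: J1c and J1c2f look different, but they are the same branch reported at different depths.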

With mitochondrial DNA, you can at least see some consistent pathway in naming practices, meaning H, H1, H1a, etc., so you can tell that you’re on the same branch.

With Y DNA, the only consistent part is the base haplogroup.

With Y DNA, let’s say that Company 1 programs their chip to test for specific SNP  locations, and they return a Y DNA haplogroup of R-L21.

Company 2 programs their chip to test for fewer or different locations and they return a Y DNA haplogroup of R-M269.

You purchase a Big Y-500 test at Family Tree DNA, and they return your haplogroup as R-CTS3386.

All three haplogroups can be correct, as far as they go. It’s just that they don’t test the same distance down the Y chromosome tree.

R-M269, R-L21 and R-CTS3386 are all increasingly smaller branches on the Y haplotree.
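Because Y haplogroup names don’t nest the way mitochondrial names do, software needs the tree itself to know which result is the most refined. The Python sketch below assumes a tiny three-branch path, in the order given above, with the many intermediate branches of the real tree left out. The list and function names are hypothetical.

```python
# Branches ordered from largest (oldest) to smallest (youngest), in the order
# the article lists them. Intermediate branches on the real tree are omitted.
EXAMPLE_PATH = ["R-M269", "R-L21", "R-CTS3386"]


def base_haplogroup(name: str) -> str:
    """Return the part before the hyphen, e.g. 'R' for 'R-M269'."""
    return name.split("-", 1)[0]


def most_refined(results, path=EXAMPLE_PATH):
    """Return the result furthest down the given path, or None if none are on it."""
    on_path = [result for result in results if result in path]
    if not on_path:
        return None
    return max(on_path, key=path.index)


vendor_results = {
    "Company 1": "R-L21",
    "Company 2": "R-M269",
    "Big Y-500": "R-CTS3386",
}

print(most_refined(vendor_results.values()))                     # R-CTS3386
print(base_haplogroup("E-M2") == base_haplogroup("R-M269"))      # False
```

Without the ordered path, nothing in the names R-L21 and R-CTS3386 tells you which is further down the tree. That is why only the base haplogroup, the letter before the hyphen, is directly comparable between vendors.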

Furthermore, for both Y and mitochondrial DNA, there is always a remote possibility that a critical location won’t be able to be read in your DNA sample that might affect your haplogroup.

Obtaining Your Haplogroup

I strongly encourage people to test with and upload to only well-known major companies or organizations. Some companies provide haplogroup information that is simply wrong.

Companies that I am comfortable with relative to haplogroups include:

Neither MyHeritage nor Ancestry provides Y or mitochondrial haplogroups.

The chart below shows the various vendor offerings, including Y and mitochondrial DNA matching.

Company | Offerings | Matching
Family Tree DNA – Y DNA | Y haplogroup is estimated with STR test. Haplogroup provided to most refined level possible with Big Y-500 test. Individual SNP tests also available. | Yes
Family Tree DNA – mitochondrial | At least base haplogroup provided with mtPlus test, plus more if possible, but full haplogroup plus additional mutations provided with mtFull Sequence test. | Yes
Genographic Project (obsolete in 2019) | More than base haplogroup for both Y and mitochondrial, but not full haplogroup on either. | No
23andMe | More than base haplogroup for both Y and mitochondrial, but not full haplogroup on either. | No
Living DNA | More than base haplogroup for both Y and mitochondrial, but not full haplogroup on either. | No

Want More Detail?

If you’d like to read a more detailed answer about how haplogroups are determined, take a look at the article, Haplogroup Comparisons Between Family Tree DNA and 23andMe.

Family Tree DNA’s Y-500 is Free for Big Y Customers

Did you notice something new on your Y DNA results page at Family Tree DNA this week? If not quite yet, you will soon if you have taken the Big Y test. There’s a surprise waiting for you. You can sign in here to take a look.

The first thing you might notice is that the Big Y has been renamed to the Big Y500. However, the results I want you to take a look at aren’t under the Big Y500 tab, but on your regular Y DNA Y-STR Results tab. Click to take a look.

In the past, 5 panels of Y DNA STR markers have been available:

  • Panel 1 – 1-12 markers
  • Panel 2 – 13-25 markers
  • Panel 3 – 26-37 markers
  • Panel 4 – 38-67 markers
  • Panel 5 – 68-111 markers

Now, a 6th panel has been added:

  • Panel 6 – 112-550 markers

However, there is a difference between the first 5 panels and the 6th panel.

Why is it Called the Y500?

If there is a total of 550 markers reported, why is this product called the Y500?

That’s a great question with an even greater answer.

Family Tree DNA actually tests for a total of 550 markers. Values for markers between 112 and 550 are provided FOR FREE when you take a Big Y test.

Family Tree DNA guarantees that you will receive at least a total of 500 markers, or they will rerun your Big Y test at no cost to you to obtain enough additional markers to reach 500. (The 500 number assumes that you have all 111 STR markers. If you have not tested all of the STR panels, the number will be lower by the number of STR values you haven’t tested. This means that if you took the Y67, but not the Y111, your 500 guarantee number would be 500-44, where 44 is the number of markers in the Y111 panel that you have not yet ordered.)
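The guarantee arithmetic can be sketched in a few lines of Python. This is only an illustration built from the numbers in this article; the function name and constants are my own assumptions, not anything published by Family Tree DNA.

```python
# Illustrative sketch only. Names are hypothetical; the numbers come from the article.
FULL_STR_PANEL = 111     # markers in STR panels 1-5
GUARANTEED_TOTAL = 500   # guaranteed total when all 111 STR markers have been tested


def guaranteed_markers(str_markers_tested: int) -> int:
    """Return the guaranteed marker total for a Big Y customer.

    The guarantee drops by one for every STR marker (out of 111)
    that the customer has not tested.
    """
    if not 0 <= str_markers_tested <= FULL_STR_PANEL:
        raise ValueError("STR markers tested must be between 0 and 111")
    untested = FULL_STR_PANEL - str_markers_tested
    return GUARANTEED_TOTAL - untested


for level in (37, 67, 111):
    print(level, guaranteed_markers(level))
```

For a man who tested at Y67, the untested portion is 111 - 67 = 44 markers, so the guaranteed total works out to 500 - 44 = 456.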

The best part?

The markers above 111 are ENTIRELY FREE with a Big Y test – for both existing customers who have already taken that test, and all future customers too. Yes, you read that right. If you took the Big Y previously, you are receiving the markers in panel 6, 112-550, absolutely free.

How does it get better than free?

The Big Y Uses a Different Technology

There is a difference between the first 111 markers and the markers from 112-550, meaning that they are read using different technologies.

The results for the first 111 STR markers are produced using a technology that targets these specific areas and is very accurate.

The results for the 112-550 markers are produced using next generation sequencing (NGS) on a different testing platform than the Y-111 results. NGS, utilized for the Big Y, scans the Y chromosome rather than targeting specific locations. This scanning process is repeated several times, with values at specific locations recorded.

Scanning

Using NGS technology, your DNA is scanned multiple times, with the number of scans, such as 25 or 30, referred to as the coverage level. The goal is for multiple/most/all scans to find the same value at the same location consistently. Because of the nature of scanning technology, this sometimes doesn’t happen, for various reasons, including “no-calls” which is when for some reason, the scans simply can’t get a reliable read at that location in your DNA. No calls are typical and occur at low levels in everyone’s scan.

Here’s an example from a Big Y scan viewing the actual results using the Big Y chromosome browser.

The blue bars are forward reads and the green bars are reverse reads. Dark blue and dark green bars indicate high quality scans. Medium blue and green are medium quality scans and faintly colored bars indicate poor quality. If you take a look at where the little black arrow at the top is pointing, you can see that a T is the expected value at that location.

When the expected value as determined in the human reference genome is found at that location, nothing is recorded in that column. However, when a different result is discovered, like A in this case, it’s noted and highlighted with pink. We can see that there are 5 As on forward and reverse strands of high quality, then a low quality read, 6 more high quality reads, followed by two reads that show the expected value (nothing recorded) and then three more high quality A reads.

The goal is to determine what actual value resides at that location, and when that value is determined, it’s referred to as a “call.”

For a “call” to be made, meaning the determination of the actual value in that position, the person or software making the call must take several quality factors into consideration.

In this case, the number of high quality reads indicating the derived (mutation) value of “A” allows this location to be definitively called as “A.” Because several other men previously tested have A at this location, a SNP name has already been assigned to this mutation – in this case, A126 in haplogroup R.

However, if you look to the right and left of the arrow to the next two browser locations that contain mutations, you can see in both cases that fewer than half of the column locations are marked as pink with derived values (mutations), meaning those not expected when compared to the reference model.

Locations like these, which are neither clearly ancestral (reference model) nor clearly derived, are where judgement calls come into play in deciding which value, ancestral or derived, is actually present in the DNA of the person being tested.

Some people will call a SNP with only one mutation reported out of 20 or 30 scans. Some people will call a SNP with 2 scans; some with 5, and so forth. Generally, Family Tree DNA uses a minimum threshold of 5 high quality scans to call a mutation value.
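To make the threshold idea concrete, here is a minimal Python sketch of a "count the high quality reads" rule. It is a simplification under stated assumptions: the function name, the quality labels and the read list are hypothetical, and real calling takes several more quality factors into account than a simple count.

```python
from collections import Counter


def call_position(reads, reference_base, min_high_quality=5):
    """Return the derived base if enough high quality reads support it, else None.

    reads is a list of (base, quality) pairs, where quality is
    "high", "medium" or "low". Only high quality reads that differ
    from the reference are counted toward the threshold.
    """
    derived = Counter(
        base
        for base, quality in reads
        if quality == "high" and base != reference_base
    )
    if not derived:
        return None
    base, count = derived.most_common(1)[0]
    return base if count >= min_high_quality else None


# Reads modeled loosely on the browser example: the reference value is T.
# Assumptions: the one low quality read is an A, and the two reads showing
# the expected value are high quality.
reads = (
    [("A", "high")] * 5
    + [("A", "low")]
    + [("A", "high")] * 6
    + [("T", "high")] * 2
    + [("A", "high")] * 3
)
print(call_position(reads, "T"))
```

With 14 high quality A reads, this location clears a threshold of 5 easily. A location with only one or two derived reads would return None under the same rule, which is where different analysts make different choices.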

Now, let’s talk about how STR values, meaning results displayed in those locations between 112-550, are found in your Big Y NGS data file. You can read about the difference between SNPs and STRs in the article, STRs vs SNPs, Multiple DNA Personalities.

STRs

Short tandem repeats, known as STR values, are the numbers reported in your STR panels. These are stutters of DNA, kind of like the copy machine got stuck in that one area for a few copies.

For example, in haplogroup R, for this person, the value of 13, meaning 13 repeats of a particular sequence, is found at marker DYS393.

Repeated sequences are in essence inserted in-between SNPs in some DNA regions, and the number of repeats reported in STR marker panels is the number of stutters, or repeats, of a particular repeated sequence.

That sounds simpler than it is, because how to count a sequence isn’t always the same. Let’s look at an example showing 20 consecutive DNA positions.

The actual values are shown in the value row. However, these values can be counted in a number of different ways. I’ve also added a “stray read” at location 13 which causes confusion.

At location 13, we show a value of G which does not fit into the repeat pattern. How do we interpret that, and what do we do with it?

The repeat pattern itself is a matter of where you start counting, and how you count.

I’ve color coded the repeats with blue and yellow. Incomplete repeats are red. The stray G in location 13 is green, because it breaks the repeat sequence.

In example 1, we start counting with T in position 1, and there are clearly 3 repeated groups of TACG before we hit our stray G in position 13, which stops the repeat pattern. However, after the stray G, there is one more full repeat sequence of TACG. Do we ignore the G and count the 4th TACG as part of the group, or do we count only the first 3 complete TACG sequences? The total number of repeats could be counted as either 3 or 4, depending on how we interpret the stray G in location 13.

In example 2, we start counting with GTAC, because I was simulating a reverse read where we start at the end and work backwards. In this case, we clearly have 2 repeats, then our stray G, which occurs in the middle of a repeat. Do we ignore that stray G and count the blue GTAC surrounding it as a repeat? That blue repeat group is followed by another yellow group. Do we count it at all, or do we simply stop with a marker count of 2 because the G is in the way and breaks the sequence? This repeat sequence could be counted as 2, 3 or 4, depending on what you do with both the G and the sequence group that follows it.

Examples 3 and 4 follow the same concept and have the same questions.

All STR sequences face the issue of where to start reading. Where you begin reading can affect the number of repeat counts you wind up with, even without our stray G in position 13.
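The counting problem can be sketched in Python. The 20-letter sequence below is my reconstruction from the description of examples 1 and 2 (three TACG repeats, a stray G at position 13, one more TACG, then an incomplete TAC), so treat it as an assumption. The function is a hypothetical illustration, not how any vendor actually counts repeats.

```python
def count_repeats(sequence, motif, start=0):
    """Count consecutive, complete copies of motif beginning at index start."""
    count = 0
    position = start
    while sequence.startswith(motif, position):
        count += 1
        position += len(motif)
    return count


# Reconstructed example sequence; the stray G sits at position 13 (index 12).
SEQUENCE = "TACGTACGTACGGTACGTAC"
# The same sequence if we decide to ignore the stray G.
WITHOUT_STRAY = SEQUENCE[:12] + SEQUENCE[13:]

# Example 1: start with T in position 1 and count TACG.
print(count_repeats(SEQUENCE, "TACG", 0))       # stop at the stray G
print(count_repeats(WITHOUT_STRAY, "TACG", 0))  # ignore the stray G

# Example 2: start with the first G (position 4) and count GTAC.
print(count_repeats(SEQUENCE, "GTAC", 3))       # stop at the stray G
print(count_repeats(WITHOUT_STRAY, "GTAC", 3))  # ignore the stray G
```

Stopping at the stray G gives 3 for the TACG reading and 2 for the GTAC reading, while ignoring it gives 4 for both. The same DNA produces different repeat counts depending only on the rules chosen.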

STR markers obtained from NGS sequencing face this same challenge, but it’s complicated by the issue of no-reads and the call variance that we saw in the chromosome browser where the same location is sometimes called differently on different scans, meaning we really can’t tell which is the actual value. What do we do with those?

All of this is complicated by the fact that some regions of the Y chromosome simply do not produce valid or reliable information. Different (groups of) people define this unreliable region as starting and ending in different locations. Therefore different people analyzing the same information often arrive at different answers to the same question or use marker locations that others don’t.

I suspect all of this may fall into the category of trivia you never wanted to know, but now you’ll understand why you may find different (sometimes strongly held) opinions of what is “right” when two geeky types are arguing strongly about a particular STR value as your eyes glaze over…

Here’s the bottom line – if you’re using results called by the same vendor, you don’t have to worry about whether you and someone else are being accurately compared. You and everyone else at that vendor will have your results reported using the same technology and calling methodology.

Family Tree DNA has always taken a more conservative approach, because they only want to report to customers what they know to be accurate.

You will not see low confidence values on your reports, nor calls from an unreliable region. Genealogists cannot reach reliable genealogical conclusions using unreliable data.

The Big Y 500

Because of the nature of scanned STR results, Family Tree DNA can’t guarantee that you will have a reliable read at every location. In fact, few people will have values at every location. The technology for the Y-111 markers provides a very high level of accuracy and Family Tree DNA will provide results for every 1-111 location unless you actually have a deletion, meaning no DNA in that location. However, the values of markers 112-550 are taken from the Big Y NGS scan.

Therefore, some Big Y customers will have a few markers above 111 that show a “-” instead of results, such as FTY945 and FTY1025, shown below. A value of “0” found in markers 1-111 means that there is actually no DNA in that location, and it’s not a read error. No DNA at a specific location is heritable, meaning it can serve as a line-marker mutation, while a “no call” means that the scan couldn’t read that genetic address. No calls cannot be compared to others and should be ignored.

Before someone starts to complain about having markers with “no reads,” remember that Family Tree DNA is providing up to 439 additional markers available FOR FREE to customers who have taken (or will take) the Big Y test.

That’s right, there is no charge for these new markers. You are guaranteed 389 additional markers, but you may actually receive as many as 439, depending on how well your DNA reads. The kits I’ve checked have only been missing a couple of marker values, so these kits received 437 additional markers, far above the guaranteed 389.

Right now, matching is not included for the 112-550 markers. Matching above 111 markers may be challenging because while Family Tree DNA does guarantee that you’ll have at least 389 new marker values, those won’t be the same markers above 111 for everyone. In a worst-case scenario, you could mismatch with someone on as many as 100 markers above the 111 panel, simply because you and the person you are matching against are each missing 50 different markers, for a total of 100 markers mismatching.
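One way to picture the problem is a comparison that skips no-calls instead of counting them as mismatches. The Python sketch below is purely hypothetical: Family Tree DNA does not currently offer matching on these markers, the marker values are invented, and the function is only meant to show the difference between a “-” (no call) and a “0” (no DNA at that location).

```python
NO_CALL = "-"  # the scan could not read this location


def compare_markers(kit_a, kit_b):
    """Compare two kits marker by marker.

    Returns (compared, mismatches, skipped). A marker is skipped when
    either kit has a no-call. A value of 0 is a real result and is compared.
    """
    compared = mismatches = skipped = 0
    for marker in kit_a.keys() & kit_b.keys():
        value_a, value_b = kit_a[marker], kit_b[marker]
        if value_a == NO_CALL or value_b == NO_CALL:
            skipped += 1
            continue
        compared += 1
        if value_a != value_b:
            mismatches += 1
    return compared, mismatches, skipped


# Invented values for illustration only.
kit_a = {"DYS393": 13, "DYS425": 0, "FTY945": NO_CALL, "FTY1025": 9}
kit_b = {"DYS393": 13, "DYS425": 12, "FTY945": 11, "FTY1025": NO_CALL}

print(compare_markers(kit_a, kit_b))
```

Here only two markers can actually be compared, and one of them mismatches because a 0 is a valid value. In the worst case described above, two men each missing 50 different markers would have 100 of the 439 markers above 111 skipped, leaving 339 that could be compared.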

Additionally, not everyone has tested all 111 STR markers, and you will receive your 112-550 values if you have taken the Big Y test regardless of whether or not you’ve tested all 111 STR markers.

Matching

Matching on the first 111 markers is reliable because you will have an accurate value, even if the value is 0. Having no DNA at a specific location is a valid result and can be compared to other testers.

With different markers between 112 and 550 missing for different men, matching becomes very tricky. Specifically, how do we interpret mismatches? How many mismatches do we allow to still be considered a reasonable match?

Matching is an entirely different prospect when integrating the markers between 112 and 550 into the equation with a potential of up to 100 mismatching locations in that range simply from no-reads.

I had presumed that Family Tree DNA would offer matching on these additional markers. Presume is a dangerous word, I know. Matching is not offered right now, and given the complexities, I don’t know if matching as we know it will be offered in the future or not, how reliable it would be, or how Family Tree DNA would compensate for the missing STR information that differs with each person’s test.

Furthermore, I’m not quite sure what they would do with two men who haven’t both tested to the same STR level, meaning panels 1-5, but have taken the Big Y so have values for 112-550.

Big Y Purchases

Here’s the status of Big Y tests, today:

  • New Big Y purchase if you have done no Y DNA testing at all – you will now be able to purchase a Big Y without having to previously purchase any STR markers. The 111 STR markers are now bundled into the Big Y purchase, which makes the Big Y appear more expensive than before when the STR markers had to be purchased separately before you could order a Big Y test. The Big Y plus all 111 STR markers is now $649 during the DNA Day Sale, regularly $799.
  • Already tested through 111 STRs – the Big Y is only $349 on sale right now, and $449 regularly, both significantly discounted from just a few months ago.
  • Existing customers who have taken some level of Y STR test but not the Big Y – will have to upgrade their STR test to the 111 level when ordering the Big Y. Those tests are discounted appropriately, shown in the table below.
  • Existing customers who have not tested their STR markers to 111, but have already taken the Big Y – will receive marker values from 112-550. However, they will only receive the Y STR markers below 112 for panels they have paid for. This means that if you have only tested to 37 markers, you will have results for locations 1-37, not for 38-111, but will have results for locations that read from 112-550. This would be the perfect time to upgrade so that you have a complete marker set.

Right now, Family Tree DNA is having their DNA Day Sale and it’s a great time to purchase a Big Y or to upgrade your STR markers if you don’t have the full 111. The sale pricing shown is valid through April 28th. You can click here to order.
