Haplogroups: DNA SNPs Are Breadcrumbs – Follow Their Path

Recently a reader asked some great questions.

If Y-DNA is unchanged, then why isn’t the Y-DNA of every man the same today? And if it’s not the same, then how do we know that all men descend from Y-Adam? Are the scientists just guessing?

The scientists aren’t guessing, and the recent scientific innovations behind how this works are pretty amazing, so let’s unravel these questions one at a time.

The first thing we need to understand is how Y-DNA is inherited differently from autosomal DNA, and how it mutates.

First, a reminder that:

  • Y-DNA tests the Y chromosome passed from father to son in every generation, unmixed with any DNA of the mother. This article focuses on Y-DNA.
  • Mitochondrial DNA tests the mitochondria passed from mothers to all of their children, but is only passed on by the females, unmixed with the DNA of the father. This article also pertains to mitochondrial SNPs, but we will cover that more specifically later in another article.
  • Autosomal DNA is passed from both parents to their children. Each child inherits half of each parent’s autosomal DNA.

Let’s look at how this works.

Autosomal vs Y-DNA Inheritance

Autosomal DNA, shown here with the green (male) and pink (female) images, divides in each generation as it’s passed from the parent to their child. Each child inherits half of each parent’s autosomal DNA, meaning chromosomes 1-22. For this discussion, each descendant shown above is a male and has a Y chromosome.

This means that in the first generation, which would be the great-grandfather, about 700,000 locations of his green autosomal DNA are tested for genealogy purposes.

His female partner (pink) also has about 700,000 locations. During recombination, they each contribute about 350,000 SNPs (Single Nucleotide Polymorphisms) of autosomal DNA to their child. Their offspring then has a total of 700,000 SNPs, 350,000 green and 350,000 pink contributed by each parent.

This process is repeated for each child, whether male or female (with the exception of the X chromosome, which is beyond the scope of this article), but each child does not receive exactly the same half of their parents’ autosomal DNA. Recombination is random.

In the four generations shown above, the green autosomal DNA of generation one, the great-grandfather, has been divided and recombined three times. The original 700,000 locations of great-grandfather’s green DNA have now been whittled down to about 87,500 locations of his green DNA.
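The halving arithmetic can be checked with a few lines of Python. This is only a sketch of the averages used in this article: the 700,000 figure is the approximate number of tested locations mentioned above, the function name is made up for illustration, and real recombination is random, so an actual descendant will carry somewhat more or less than the average.

```python
# Approximate number of autosomal locations tested for genealogy (figure from the article).
TESTED_LOCATIONS = 700_000


def expected_locations(generations_down: int) -> float:
    """Average number of an ancestor's tested locations expected to remain
    after the DNA has been halved `generations_down` times."""
    return TESTED_LOCATIONS / 2 ** generations_down


# Generation 0 is the great-grandfather himself; generation 3 is his great-grandson.
for generation in range(4):
    print(generation, expected_locations(generation))
```

Three halvings take 700,000 to 350,000, then 175,000, then 87,500, which is the figure quoted for the great-grandson.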

Y-DNA in the Same Generation

Looking now at the blue Y-DNA at left, the Y-DNA remains the same in each generation with the exception of one mutation approximately every two or three generations.

As you can see in the chart, in the exact same number of generations, the Y-DNA of each male, which he inherited from his father:

  • Never recombines with any DNA from the mother
  • Never divides and gets smaller in subsequent generations
  • Remains essentially unchanged in each generation

The key word here is “essentially.”

Y-DNA

The Y chromosome consists of about 59 million locations of DNA. STR (Short Tandem Repeat) tests, which essentially measure insertions and deletions, examine a limited number of carefully curated markers selected because they mutate in a genealogically relevant timeframe. These markers are combined into the 67 and 111 marker panels available for purchase at FamilyTreeDNA today; historically, 12, 25, 37, 67, and 111 marker panels were offered. The STR test was the original Y-DNA test for genealogy and is still used as an introductory test or to see whether a male matches a specific line.

From the STR tests, in addition to matching, FamilyTreeDNA can reliably predict a relatively high-level haplogroup, or genetic clan, based on the frequency of the combinations of those marker values in specific STR locations.

SNPs are much more reliable than STRs, which tend to be comparatively unstable. STRs mutate at an unreliable rate and sometimes back mutate, which can be very disconcerting for genealogy. We need reliable consistency to be able to assign a male tester to a specific lineage with confidence. STR tests can, however, reveal genealogically relevant matches that may be quite important, so I never disregard STR tests or testers. STR tests aren’t relevant for deeper history, nor can they reliably discern a specific lineage within a surname. SNP tests can and do.

The Big Y-700 SNP test gives us that and more, along with the earlier Big Y-500 test which scanned about 30 million locations. The Big Y-700 is a significant improvement; men can upgrade from the Big Y-500 or STR tests.

The Big Y-700 test scans about 50 million Y-DNA locations, known as the gold standard region, for all mutations. It reports 700 or more STR markers for matching, but more importantly, it scans for all SNP mutations in those 50 million locations.

All mutations are confirmed by at least five positive repeat scans and are then assigned a haplogroup name if found in two or more men.
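Those two thresholds can be written down as a toy rule. This is a minimal sketch based only on the sentence above, not FamilyTreeDNA's actual pipeline; the constant and function names are hypothetical.

```python
# Thresholds taken from the article; everything else here is illustrative.
MIN_POSITIVE_SCANS = 5   # positive repeat scans needed to confirm a mutation
MIN_MEN_FOR_NAME = 2     # men who must share the mutation before it is named


def is_confirmed(positive_scans: int) -> bool:
    """A mutation counts as confirmed for one man with at least five positive scans."""
    return positive_scans >= MIN_POSITIVE_SCANS


def gets_haplogroup_name(positive_scans_per_man: list[int]) -> bool:
    """A mutation is named once it is confirmed in two or more men."""
    confirmed_men = [scans for scans in positive_scans_per_man if is_confirmed(scans)]
    return len(confirmed_men) >= MIN_MEN_FOR_NAME


print(gets_haplogroup_name([7, 5]))  # confirmed in two men
print(gets_haplogroup_name([7, 4]))  # second man falls short of five scans
print(gets_haplogroup_name([9]))     # only one man so far
```

Under this rule, a mutation seen in only one tester stays unnamed until a second man with the same confirmed mutation tests.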

Y-DNA Testing

If Y-DNA remained exactly the same, then the Y-DNA of men today would be entirely indistinguishable from each other – essentially all matching humankind’s first common ancestor. With no changes, Y-DNA would not be useful for genealogy. We need inherited mutations to be able to compare men and determine their level of relatedness to each other.

Fortunately, Y-DNA SNPs do mutate. Y-DNA is never divided or combined, so it stays essentially the same except for occasional mutations which are inherited by the following generations.

Using SNP markers scanned in the Big Y test, one new mutation happens on the average of every two or three generations. Of course, that means that sometimes there are no mutations for a few generations, and sometimes there are two mutations between father and son.

What this does, though, very effectively, is provide a trail of SNP mutations – breadcrumbs essentially – that we can use for matching, AND for tracking our mutations, which equate to ancestors, back in time.

Estes Male Breadcrumb Trail

I’ve tested several Estes men of known lineage, so I’m going to use this line as an example of how mutations act as breadcrumbs, allowing us to track our ancestors back in time and across the globe.

Multiple cousins in my Estes line have taken the Big Y-700 test.

My closest male cousin matches two other men on a unique mutation. That SNP has been named haplogroup R-ZS3700.

We know, based on our genealogy, that this mutation occurred in Virginia and is found in the sons of Moses Estes born in 1711.

How do we know that?

We know that because three of Moses’s descendants have tested and all three of those men have the same mutation, R-ZS3700, and none of the descendants of Moses’s brothers have that mutation.

I’ve created a chart to illustrate the Estes pedigree chart, and the haplogroups assigned to those men. So, it’s a DNA pedigree chart too. This is exactly what the Big-Y DNA test does for us.

In the red-bordered block of testers, you can see the three men that all have R-ZS3700 (in red), and all are descendants of Moses born in 1711. I have not typed the names of all the men in each generation because, for purposes of this illustration, names aren’t important. However, the concept and the fact that we have been able to connect them genealogically, either before or because of Y-DNA testing, is crucial.

Directly above Moses born in 1711, you can see his father Abraham born in 1647, along with Moses’ brothers at right and left; John, Richard, Sylvester, and Elisha whose descendants have taken the Big Y-700 test. Moses’s brothers’ descendants all have haplogroup R-BY490 (in blue), but NOT R-ZS3700. That tells us that the mutation responsible for R-ZS3700 happened between Abraham born in 1647, and Moses born in 1711. Otherwise, Moses’s brothers would have the mutation if his father had the mutation.

Moses’s descendants also have R-BY490, but it’s NOT the last SNP or haplogroup in their lineage. For Moses’s descendants, R-ZS3700 occurred after R-BY490.

You can see haplogroup R-BY490 boxed in blue.

We know that Moses and his father, Abraham, both have haplogroup R-BY490 because all of Abraham’s sons have this haplogroup. Additionally, we know that Abraham’s father, Silvester also had haplogroup R-BY490.

How do we know that?

A descendant of Abraham’s brother, Richard, tested and has haplogroup R-BY490.

However, Silvester’s father, Robert born in 1555 did NOT have R-BY490, so it formed between him and his son, Silvester.

How do we know that?

Robert’s other son, Robert born in 1603 has a descendant who tested and has haplogroup R-BY482, but does NOT have R-BY490 or R-ZS3700.

All of the other Estes testers also have R-BY482, blocked in green, in addition to R-BY490, so we know that the mutation of R-BY490 developed between Robert born in 1555 and his son, Silvester born in 1600, because his other son’s descendant does not have it.

Looking at only the descent of the haplogroups, in order, we have:

  • R-BY482 (green) found in Robert born in 1555 and all of his descendants.
  • R-BY490 (blue) found in Silvester born in 1600 and all of his descendants, but not his brother.
  • R-ZS3700 (red) found in Moses born in 1711 and all of his descendants, but not his brothers.
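The reasoning used for all three haplogroups is the same, so it can be sketched in code. This is a simplified illustration, not how FamilyTreeDNA computes anything. The pedigree below includes only the men named in this article, each tester is attached directly to his named Estes ancestor (the generations in between are omitted), the tester labels and function names are made up, and back mutations are ignored.

```python
# Son -> father, limited to the men named in the article.
PARENT = {
    "Silvester 1600": "Robert 1555",
    "Robert 1603": "Robert 1555",
    "Abraham 1647": "Silvester 1600",
    "Richard (son of Silvester)": "Silvester 1600",
    "Moses 1711": "Abraham 1647",
    "John": "Abraham 1647",
    "Richard (son of Abraham)": "Abraham 1647",
    "Sylvester (son of Abraham)": "Abraham 1647",
    "Elisha": "Abraham 1647",
}

ALL3 = {"R-BY482", "R-BY490", "R-ZS3700"}
TOP2 = {"R-BY482", "R-BY490"}

# Hypothetical tester label -> (named ancestor, haplogroups the tester carries).
TESTERS = {
    "Moses tester 1": ("Moses 1711", ALL3),
    "Moses tester 2": ("Moses 1711", ALL3),
    "Moses tester 3": ("Moses 1711", ALL3),
    "John tester": ("John", TOP2),
    "Richard tester": ("Richard (son of Abraham)", TOP2),
    "Sylvester tester": ("Sylvester (son of Abraham)", TOP2),
    "Elisha tester": ("Elisha", TOP2),
    "Richard (Silvester) tester": ("Richard (son of Silvester)", TOP2),
    "Robert 1603 tester": ("Robert 1603", {"R-BY482"}),
}


def lineage(person):
    """Return [person, father, grandfather, ...] up to the top of the chart."""
    path = [person]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def bracket(snp):
    """Return (latest ancestor shown not to carry snp, earliest man who must carry it).

    The first item is None when no tester lacks the SNP, meaning the chart
    cannot tell us how far back the mutation goes.
    """
    positive = [lineage(anc) for anc, snps in TESTERS.values() if snp in snps]
    negative = [lineage(anc) for anc, snps in TESTERS.values() if snp not in snps]
    # The most recent common ancestor of every tester who has the SNP must carry it.
    carrier = next(man for man in positive[0] if all(man in line for line in positive))
    # Walking upward, the first ancestor with a descendant lacking the SNP did not carry it.
    for ancestor in lineage(carrier)[1:]:
        if any(ancestor in line for line in negative):
            return ancestor, carrier
    return None, carrier


for snp in ("R-BY482", "R-BY490", "R-ZS3700"):
    print(snp, bracket(snp))
```

Run against this small chart, the sketch brackets R-ZS3700 between Abraham and Moses, and R-BY490 between Robert born in 1555 and Silvester. For R-BY482 it can only say that Robert born in 1555 carried it, because no tester in the chart lacks it.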

If we had Estes men who descend from the two additional documented generations upstream of Robert born in 1555, we might discover when R-BY482 occurred, but to date, we don’t have any additional testers from those lines.

Now that we understand the genesis of these three haplogroups in the Estes lineage, what else can we discover through our haplogroup breadcrumbs?

The Discover Reports

By entering the haplogroup in the Discover tool, either on the public page, here, or clicking on Discover on your personal page at FamilyTreeDNA if you’ve taken the Big-Y test, you will see several reports for your haplogroup.

I strongly suggest reviewing each category, because they cumulatively act as chapters to the book of your haplogroup story, but we’re going to skip directly to the breadcrumbs, which is called the Ancestral Path.

The Ancestral Path begins with your haplogroup in Line 1 then lists the first upstream or parent haplogroup in Line 2. In this case, the haplogroup I entered is R-ZS3700.

You can see the estimated age of the haplogroup, meaning when it formed, at about 1700 CE. Moses Estes who was born in 1711 is the first Estes man to carry haplogroup R-ZS3700, so that’s extremely close.

Line 2, R-BY490 occurred or was born about 1650, and we know that it actually occurred between Robert and Silvester born in 1600, so that’s close too.

Scanning down to Line 3, R-BY482 is estimated to have occurred about 1500 CE, and we know for sure it had occurred by 1555 when Robert was born.
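To see how close the estimates are, the three dates can be lined up against the genealogy. This is just arithmetic on the figures quoted above; the list layout and function name are my own for illustration, and the genealogy year is the birth year of the man who is known to have carried the haplogroup.

```python
# (haplogroup, estimated formation year CE, birth year of the known carrier)
ANCESTRAL_PATH = [
    ("R-ZS3700", 1700, 1711),  # Moses, the first Estes man to carry it
    ("R-BY490", 1650, 1600),   # Silvester, who already carried it
    ("R-BY482", 1500, 1555),   # Robert, who already carried it
]


def estimate_minus_genealogy(path):
    """Years by which each estimate differs from the carrier's birth year.

    A negative number means the estimate falls before the carrier's birth.
    A positive number means the estimate is later than the genealogy allows.
    """
    return {name: estimated - born for name, estimated, born in path}


print(estimate_minus_genealogy(ANCESTRAL_PATH))
```

The differences work out to 11 years early for R-ZS3700, 50 years late for R-BY490, and 55 years early for R-BY482. The last one is fully consistent with the genealogy, since we only know that mutation existed by 1555.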

We see the parent haplogroup of R-BY487 on Line 4, dating from about 750 CE. Of course, if more men test, it’s possible that more haplogroups will emerge between BY482 and BY487, forming a new branch. Given the time involved, those men wouldn’t be expected to carry the Estes surname, as surnames hadn’t yet been adopted in that timeframe.

Moving down to Line 9, we see R-ZP18 from 2250 BCE, or about 4250 years ago. Looking at the right column, there’s one ancient sample with that haplogroup. The location of ancient samples anchors haplogroups definitively in a particular location at a specific time.

Haplogroup by haplogroup, step by step, we can follow the breadcrumbs back in time to Y-Adam, the first Homo sapiens male known to have descendants today, meaning he’s the MRCA, or most recent common ancestor, for all men.

Neanderthals and Denisovans follow, but their Y-DNA is only available through ancient samples. They have no known direct male survivors, but someday, maybe someone will test and their Y-DNA will be found to descend from Neanderthals or Denisovans.

Now that we know when those haplogroups occurred, how did our ancestors get from Africa 232,000 years ago to Kent, England, in the 1400s? What path did they take?

The new Globetrekker tool answers that question.

The Breadcrumb Trail

In Globetrekker, each haplogroup’s location is placed by a combination of testers’ results, their identified earliest known ancestor (EKA) country and location, combined with ancient samples, climatic factors like glaciers and sea levels, and geographic features. You can read about Globetrekker here and here.

To view the Globetrekker tool, you must sign in to an account that has taken the Big Y test. It’s a tool exclusively provided for Big-Y testers.

You can click at the bottom of your Globetrekker map to play the animated video.

Beginning in Africa, our ancestors began their journey with Y-Adam, then migrated through the Near East, South Asia, East Asia, then west through central Asia into Europe. The Estes ancestors crossed the English Channel and migrated around what is now England before settling in Deal, on the east coast.

Clicking on any haplogroup provides a description of that haplogroup and how it was placed in that location.

Enabling the option for ancient DNA shows those locations as well, near the haplogroups they represent when the animation is playing.

Clicking on the shovel icon explains about that particular ancient DNA sample, what is known, and how it relates to the haplogroup it’s connected to by a dotted line on the map.

Pretty cool, huh!!

End to End

As you can see from this example, Big Y results are an end-to-end tool.

We can use the Big Y-700 haplogroups very successfully for recent genealogy – assigning testers to specific lines in a genealogy timeframe. Some haplogroups are so specific that, without additional information, we can place a man in his exact generation, or within a generation or two.

Not shown in my Estes pedigree chart is an adoptee who, of course, has a different surname. We know that he descends from Moses’s line because he carries haplogroup R-ZS3700, but we are still working on the more recent generations using autosomal DNA to connect him accurately. If more of Moses’s descendants tested, we could probably place him very specifically. Without the Big Y-700 test, he wouldn’t know his biological surname or that he descends from Moses. That’s a HUGE breakthrough for him.

There’s more about the Estes line to learn, however.

If our Estes cousins tested their brothers, uncles or other Estes males in their line, they would likely receive a more refined haplogroup that’s relevant only to that line.

Using Big-Y test results, we can place men within a couple of generations and identify a common ancestor, even when all men within a haplogroup don’t know their genealogical lineage. Using those same test results, we can follow the breadcrumbs all 50 steps back in time more than 230,000 years to Y-Adam.

End to end, the Big-Y test coupled with breadcrumbs in Discover, Globetrekker, and other amazing tools is absolutely the most informative and powerful test available to male testers for their paternal line genealogy.

These amazing innovations tracking more than 50,000 haplogroups across the globe answer the original questions about how we know.

The more people who take or upgrade to the Big Y-700 test, the more haplogroup branches will be added, and the more refined the breadcrumbs, ages, and maps will become. In other words, there’s still more to learn.

Test if you haven’t, and check back often for new matches and breadcrumbs, aka updates.

22 thoughts on “Haplogroups: DNA SNPs Are Breadcrumbs – Follow Their Path”

  1. Thank you so much for this article. I just ordered a Big Y-700 for a Clemons, Clement, etc. man, but had no idea how to use the results when I get them.

  2. Thank you for the very detailed and clarifying explanation!
    Assuming that these STR counts are being done “automatically” by machines, and each sample needs to be scanned at least 5 times to confirm correct designation, are the results then reviewed by live personnel for any glitches or inconsistencies before final assignment? In the interest of setting expectations for testers, what is the approximate time needed for all the steps for each individual sample to happen before the tester is notified?

    • Actually, each sample is scanned about 30 times, sometimes more. To confirm a mutation at a specific location, there have to be five of those SNP scans, minimally, to be considered a “positive” result. If the sample matches existing testers, with no new haplogroup, I don’t think it’s reviewed unless it’s kicked out for quality control issues. However, if a new haplogroup is found, yes, it’s manually reviewed. I think it’s 6-8 weeks if no step has to be rerun, but sometimes it’s longer. And if the sample fails, they have to get a new sample and start over.

  3. As a descendant of Moses’s brother John. I find this fascinating. Unfortunately, my most recent Estes ancestor was my great grandmother, who only had one brother who only had daughters, so we can’t really help much.

  4. I think it important to point out that for a SNP to show up in a son’s Y chromosome, the mutation must have occurred in the father’s germline during spermatogenesis. We all experience mutations in the millions of cells in our bodies every day, including in the cells’ Y chromosome which is in every cell in a man’s body; however, the mutation rate in the germline is significantly lower than in all the other somatic cells, but does increase with age. Thus, an older man having a son has an increased chance of passing on a mutated Y chromosome. This is why, although rare, one may see two different SNPs in two brothers, both different from the father’s haplogroup. It would appear then that the longer a man waits to have a son, the greater the chance of a SNP occurring between the generations. The same holds true for STRs, likely more so.

  5. Thanks Roberta for another great article! I was inspired to finally order the Big-Y as a result (upgrading my STR 67) . . . can’t wait to see what they come up with and hopefully the data will be useful for the Estes project. Have a great rest of your Summer!

  6. Question – Why is the man who is the most recent common ancestor of my haplogroup line (R-FT121111) estimated to have been born around 994 CE – approximately 1000 years ago with no mutations? One other person tested positive for the same haplogroup as mine. On average, how often do haplogroups mutate? I am guessing that more persons need to test to show mutations along the parent haplogroup (R-FT121111).

  7. Roberta, how do you come up with 700,000 “locations” (aka SNP’s) in your example? Autosomal DNA has over 3 Billion SNP’s, not 700,000, so your example of cutting segments in half (which is also not what happens in nature but I guess you wanted to simplify things for the reader) is limping.

    • That’s an average of what the testing companies use, which is what’s relevant for genetic genealogy.

  8. Great read Roberta — I’ve done my BigY700 as well as sponsored several more — over 20 years ago I discovered my GGF was born out of wedlock in 1877 and his parents did not marry (discovered this through my research and it was covered up pretty good) – so my YDNA doesn’t go along with my surname (actually paternal Bailey). About 10+ years ago I started noticing autosomal connections to some other Bryan/t lines (My paternal GGGM line). At first thought there were cousins on my own branch I did not know of, but it became obvious it was different lines all together — over the years it turned into 4 different lines (2 Bryant and 2 Bryan). Since my Y would not help I sponsored YDNA to all 4 lines some lower but at least one at the BigY700 level. My limited experience with triangulation (FTDNA tool) looked like we all matched on at least one segment. At 111 markers it appeared that the shared ancestor would be my 5th or 6th GGF (my brick wall on that line is my 5th GGF) — But the BigY700 showed 3 with same exact Haplo while the 4th looks to share an ancestor further upstream (not too far). I find this odd as the amount of autosomal shared with all lines (not my own) are about the same amount and I’ve found evidence that one of the lines married a neighbor’s daughter in 1830, so there was interaction. These lines show different birth places and died in different locations. I can fudge in ThruLines (Ancestry) and put the oldest known ancestor of the line not the same Y and I get 38 autosomal connections and they all check out legit (if that Bryan was actually in my line) — would love for someone smarter than me to put eyes on this and give me their feedback. Thanks — Gene Bryant

  9. It’s so good FTDNA is now providing this for their Y-700 customers.
    Way back when I first tested my Y (as 67 STRs with later single SNP confirmation of my Haplogroup), I was in a very rare group and had to do a lot of research on my own, even after some great help from Project admins.
    It was so rewarding. I have a lifetime of documentary research and still found it hard – good luck to anyone else! So the new Y-700 Discover resources are wonderful.
    I can still see people asking, but what can it do for me?
    So it is great to have your post answering that.
    [I look forward to a Discover equivalent for mtDNA!]

  10. Your explanation and graphics are perfect. This brought it together for me, and I am already using it for my brother’s YDNA. Thank you for the article.

  11. In keeping with your advice some many columns ago, I’m getting ready a Gedcom to upload to WikiTree. The Rootsmagic program has a space for chromosome haplogroups, I’ve got a fair number of y-chromosome and mtDNA test results to report, and, as you say here, those haplogroups change. It would be accurate to fill in the blanks only for the person who took the test. Do you think it would be more helpful to other researchers to run the haplogroups back? My mtDNA haplogroup is H1cj. Should I put it, do you think, into the blanks for my matrilineal ancestors back to the C17? My y-chromosome haplogroup is good back to my great-great-great-grandfather because I share it with another of his descendants, but how about the earlier six or seven patrilineal generations? Then carrying haplogroups down. My mtDNA haplogroup should be good for my sisters and brother. How about my second cousins who derive it as I did from my great-grandmother? Guidance and counsel welcome!

    Scott Swanson

    • You know, that’s a great question. I have no idea, but I know who to ask. I’m going to ask someone from WikiTree if they will answer this comment. I’m not positive exactly how that works, in terms of upstream population, especially if there’s already a haplogroup there.

    • I don’t know where the answer went that she entered, so I’m copy pasting and summarizing. Yes, if you enter the haplogroup, it gets populated up your tree. I don’t know what happens when two “collide” or how that’s resolved, and she didn’t answer that.

      She said “The answer is, you can add your haplogroup when you enter your DNA test information. Be sure to keep it up to date if it changes.” https://www.wikitree.com/wiki/Help:How_to_Get_Started_with_DNA

      My suggestion would be to contact the WikiTree team at info@wikitree.com and ask.

      • Thanks, Roberta. I did ask a few days ago in getting my Gedcom ready, and the WikiTree policy is that one adds DNA information only to the person who took the test. One might also add the information to the profiles of testers who are now dead. In any case it is a moot point at the moment. I spent months preparing a Gedcom for WikiTree — 32000 people — only to discover when I went to upload it this afternoon that WikiTree has a Gedcom limit of 5000 people. (Not clear whether that’s per Gedcom.) I’m going to try to find whom at WikiTree to write to put this information up front for newbies instead of the exciting unqualified invitation to upload your Gedcom. This after clicking box after box for three days to exclude living people.

  12. Dear Roberta, after a series of questions and answers on the G2G about Gedcoms, a series of kindly first responders told me that the limitation on Gedcoms to 5000 people was to be found in the Gedcom help file. Since I know about Gedcoms, how to make them, how to upload them, and the like, it would not have occurred to me to look in that ‘help’ file. I would have looked into a ‘requirements’ file. But nothing I have encountered suggested uploading a Gedcom to WikiTree was any different than uploading such a file to the other family history sites. I am about to write WikiTree to ask them to put that limitation/requirement up front. Since you love this site, when you next talk to people on the blog about joining it, you might well showcase that particular feature so other people do not hit the brickwall I just did. Thanks again for your sleuthing on my behalf. (I’m also going to suggest that WikiTree add a marriage date to its search engine. Of late I’m running into a fair number of people with a blurry birth range and sometimes no death at all but a quite precise marriage date.)

    • It’s been a long time since I uploaded a Gedcom there. If I recall, the idea is to connect to your first ancestor in their tree. I don’t know if it also updates information for ancestors from your file, or goes down branches and adds descendants. I hope that made sense.
