Genetic Genealogy at 20 Years: Where Have We Been, Where Are We Going and What’s Important?

Not only have we put 2020 in the rear-view mirror, thankfully, but we’re also at the two-decade milestone since genetics was first added to the toolbox of genealogists.

It seems both like yesterday and forever ago. And yes, I’ve been here the whole time, as a spectator, researcher, and active participant.

Let’s put this in perspective. On New Year’s Eve, right at midnight, in 2005, I was able to score kit number 50,000 at Family Tree DNA. I remember this because it seemed like such a bizarre thing to be doing at midnight on New Year’s Eve. But hey, we genealogists are what we are.

I knew that momentous kit number, which seemed just HUGE at the time, was on the threshold of being sold, because I had inadvertently purchased kit 49,997 a few minutes earlier.

Somehow kit 50,000 seemed like such a huge milestone, a landmark – so I quickly bought kits 49,998, 49,999, and then…would I get it…YES…kit 50,000. Score!

That meant that in the 5 years FamilyTreeDNA had been in business, they had sold an average of 10,000 kits per year, or about 27 kits a day. Today, that’s a rounding error. Then it was momentous!

In reality, the sales were ramping up quickly, because very few kits were sold in 2000, and roughly 20,000 kits had been sold in 2005 alone. I know this because I purchased kit 28,429 during the holiday sale a year earlier.
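The arithmetic behind those figures is easy to check. Here is a minimal sketch in Python that uses only the kit numbers quoted above and assumes kit numbers were issued sequentially; the variable names are purely illustrative.

```python
# Figures quoted in the text. Assumes kit numbers were issued sequentially.
KITS_BY_END_OF_2005 = 50_000
YEARS_IN_BUSINESS = 5
KIT_BOUGHT_A_YEAR_EARLIER = 28_429

# Long-run averages over the first five years.
kits_per_year = KITS_BY_END_OF_2005 / YEARS_IN_BUSINESS
kits_per_day = kits_per_year / 365

# Kits issued between the two holiday purchases, roughly calendar year 2005.
kits_sold_in_2005 = KITS_BY_END_OF_2005 - KIT_BOUGHT_A_YEAR_EARLIER

print(f"{kits_per_year:,.0f} kits per year on average")
print(f"{kits_per_day:.0f} kits per day on average")
print(f"{kits_sold_in_2005:,} kits between the two purchases")
```

The last figure, a little over 20,000, is more than double the five-year average, which is why the average understates how quickly sales were ramping up.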

Of course, I had no idea who I’d test with that momentous New Year’s Eve Y DNA kit, but I assuredly would find someone. A few months later, I embarked on a road trip to visit an elderly family member with that kit in tow. Thank goodness I did, and they agreed and swabbed on the spot, because they are gone today and with them, the story of the Y line and autosomal DNA of their branch.

In the past two decades, almost an entire generation has slipped away, and with them, an entire genealogical library held in their DNA.

Today, more than 40 million people have tested with the four major DNA testing companies, although we don’t know exactly how many.

Lots of people have had more time to focus on genealogy in 2020, so let’s take a look at what’s important. What’s going on, and what matters beyond this month or year?

How has this industry changed in the last two decades, and where is it going?

Reflection

This seems like a good point to reflect a bit.

Professor Dan Bradley reflecting on early genetic research techniques in his lab at the Smurfit Institute of Genetics at Trinity College in Dublin. Photo by Roberta Estes

In the beginning – twenty years ago, there were two companies who stuck their toes in the consumer DNA testing water – Oxford Ancestors and Family Tree DNA. About the same time, Sorenson Genomics and GeneTree were also entering that space, although Sorenson was a nonprofit. Today, of those, only FamilyTreeDNA remains, having adapted with the changing times – adding more products, testing, and sophistication.

Bryan Sykes who founded Oxford Ancestors announced in 2018 that he was retiring to live abroad and subsequently passed away in 2020. The website still exists, but the company has announced that they have ceased sales and the database will remain open until Sept 30, 2021.

James Sorenson died in 2008 and the assets of Sorenson Molecular Genealogy Foundation, including the Sorenson database, were sold to Ancestry in 2012. Eventually, Ancestry removed the public database in 2015.

Ancestry dabbled in Y and mtDNA for a while, too, destroying that database in 2014.

Other companies, too many to remember or mention, have come and gone as well. Some of the various company names have been recycled or purchased, but aren’t the same companies today.

In the DNA space, it was keep up, change, die or be sold. Of course, there was the small matter of being able to sell enough DNA kits to make enough money to stay in business at all. DNA processing equipment and a lab are expensive. Not just the equipment, but also the expertise.

The Next Wave

As time moved forward, new players entered the landscape, comprising the “Big 4” testing companies that constitute the ponds where genealogists fish today.

23andMe was the first to introduce autosomal DNA testing and matching. Their goal and focus was always medical genetics, but they recognized the potential in genealogists before anyone else, and we flocked to purchase tests.

Ancestry settled on autosomal only and relies on the size of their database, a large body of genealogy subscribers, and a widespread “feel-good” marketing campaign to sell DNA kits as the gateway to “discover who you are.”

FamilyTreeDNA did and still does offer all 3 kinds of tests. Over the years, they have enhanced both the Y DNA and mitochondrial product offerings significantly and are still known as “the science company.” They are the only company to offer the full range of Y DNA tests, including their flagship Big Y-700, full sequence mitochondrial testing along with matching for both products. Their autosomal product is called Family Finder.

MyHeritage entered the DNA testing space a few years after the others as the dark horse that few expected to be successful – but they fooled everyone. They have acquired companies and partnered along the way which allowed them to add customers (Promethease) and tools (such as AutoCluster by Genetic Affairs), boosting their number of users. Of course, MyHeritage also offers users a records research subscription service that you can try for free.

In summary:

One of the wonderful things that happened was that some vendors began to accept compatible raw DNA autosomal data transfer files from other vendors. Today, FamilyTreeDNA, MyHeritage, and GEDmatch DO accept transfer files, while Ancestry and 23andMe do not.

The transfers and matching are free, but there are either minimal unlock or subscription plans for advanced features.

There are other testing companies, some with niche markets and others not so reputable. For this article, I’m focusing on the primary DNA testing companies that are useful for genealogy and mainstream companion third-party tools that complement and enhance those services.

The Single Biggest Change

As I look back, the single biggest change is that genetic genealogy evolved from the pariah of genealogy where DNA discussion was banned from the (now defunct) Rootsweb lists and summarily deleted for the first few years after introduction. I know, that’s hard to believe today.

Why, you ask?

Reasons varied from “just because” to “DNA is cheating” and then morphed into “because DNA might do terrible things like, maybe, suggest that a person really wasn’t related to an ancestor in a lineage society.”

Bottom line – fear and misunderstanding. Change is exceedingly difficult for humans, and DNA definitely moved the genealogy cheese.

From that awkward beginning, genetic genealogy organically became a “thing,” a specific application of genealogy. There was paper-trail traditional genealogy and then the genetic aspect. Today, for almost everyone, DNA is “just another tool” in the genealogist’s toolbox, although it does require focused learning, just like any other tool.

DNA isn’t separate anymore, but is now an integral part of the genealogical whole. Having said that, DNA can’t solve all problems or answer all questions, but neither can traditional paper-trail genealogy. Together, each makes the other stronger and solves mysteries that neither can resolve alone.

Synergy.

I fully believe that we have still only scratched the surface of what’s possible.

Inheritance

As we talk about the various types of DNA testing and tools, here’s a quick graphic to remind you of how the different types of DNA are inherited.

  • Y DNA is inherited paternally for males only and informs us of the direct patrilineal (surname) line.
  • Mitochondrial DNA is inherited by everyone from their mothers and informs us of the mother’s matrilineal (mother’s mother’s mother’s) line.
  • Autosomal DNA can be inherited from potentially any ancestor in random but somewhat predictable amounts through both parents. The further back in time, the less identifiable DNA you’ll inherit from any specific ancestor. I wrote about that, here.
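To see why autosomal DNA becomes harder to trace further back, here is a rough sketch of the expected averages. It assumes the simple halving model, in which each generation contributes half as much on average; actual amounts beyond your parents vary because inheritance is random. The function names are illustrative only.

```python
def ancestors_in_generation(generations_back: int) -> int:
    """Number of ancestor slots in a given generation (1 = parents)."""
    return 2 ** generations_back


def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA expected from ONE ancestor.

    This is an average only: beyond parents, the real amount inherited
    from any specific ancestor can be higher, lower, or even zero.
    """
    return 0.5 ** generations_back


for n in range(1, 8):
    print(
        f"{n} generations back: {ancestors_in_generation(n):>3} ancestors, "
        f"about {expected_share(n):.2%} each on average"
    )
```

Seven generations back there are 128 ancestor slots, and each contributes less than 1% on average, which is why Y DNA and mitochondrial DNA remain so useful for deep ancestry.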

What’s Hot and What’s Not

Where should we be focused today and where is this industry going? What tools and articles popped up in 2020 to help further our genealogy addiction? I already published the most popular articles of 2020, here.

This industry started two decades ago with testing a few Y DNA and mitochondrial DNA markers, and we were utterly thrilled at the time. Both tests have advanced significantly and the prices have dropped like a stone. My first mitochondrial DNA test that tested only 400 locations cost more than $800 – back then.

Y DNA and mitochondrial DNA are still critically important to genetic genealogy. Both play unique roles and provide information that cannot be obtained through autosomal DNA testing. Today, relative to Y DNA and mitochondrial DNA, the biggest challenge, ironically, is educating newer genealogists, who have never heard about anything other than autosomal (often ethnicity) testing, about their potential.

We have to educate in order to overcome the cacophony of “don’t bother because you don’t get as many matches.”

That’s like saying “don’t use the right size wrench because the last one didn’t fit and it’s a bother to reach into the toolbox.” Not to mention that if everyone tested, there would be a lot more matches, but I digress.

If you don’t use the right tool, and all of the tools at your disposal, you’re not going to get the best result possible.

The genealogical proof standard, the gold standard for genealogy research, calls for “a reasonably exhaustive search,” and if you haven’t at least considered if or how Y DNA and mitochondrial DNA along with autosomal testing can or might help, then your search is not yet exhaustive.

I attempt to obtain the Y and mitochondrial DNA of every ancestral line. In the article, Search Techniques for Y and Mitochondrial DNA Test Candidates, I described several methodologies to find appropriate testing candidates.

Y DNA – 20 Years and Still Critically Important

Y DNA tracks the Y chromosome for males via the patrilineal (surname) line, providing matching and historical migration information.

We started 20 years ago testing 10 STR markers. Today, we begin at 37 markers, can upgrade to 67 or 111, but the preferred test is the Big Y which provides results for 700+ STR markers plus results from the entire gold standard region of the Y chromosome in order to provide the most refined results. This allows genealogists to use STR markers and SNP results together for various aspects of genealogy.

I created a Y DNA resource page, here, in order to provide a repository for Y DNA information and updates in one place. I would encourage anyone who can to order or upgrade to the Big Y-700 test which provides critical lineage information in addition to and beyond traditional STR testing. Additionally, the Big Y-700 test helps build the Y DNA haplotree which is growing by leaps and bounds.

More new SNPs are found and named EVERY SINGLE DAY today at FamilyTreeDNA than were named in the first several years combined. The 2006 SNP tree listed a grand total of 459 SNPs that defined the Y DNA tree at that time, according to the ISOGG Y DNA SNP tree. Goran Rundfeldt, head of R&D at FamilyTreeDNA, posted this today:

2020 was an awful year in so many ways, but it was an unprecedented year for human paternal phylogenetic tree reconstruction. The FTDNA Haplotree or Great Tree of Mankind now includes:

37,534 branches with 12,696 added since 2019 – 51% growth!
defined by
349,097 SNPs with 131,820 added since 2019 – 61% growth!

In just one year, 207,536 SNPs were discovered and assigned FT SNP names. These SNPs will help define new branches and refine existing ones in the future.

The tree is constructed based on high coverage chromosome Y sequences from:
– More than 52,500 Big Y results
– Almost 4,000 NGS results from present-day anonymous men that participated in academic studies

Plus an additional 3,000 ancient DNA results from archaeological remains, of mixed quality and Y chromosome coverage at FamilyTreeDNA.

Wow, just wow.
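The growth percentages in that post can be reproduced from the totals it quotes. This small sketch assumes growth is measured against the previous year’s total (the current total minus what was added), which is my reading of the figures and not something the post states explicitly.

```python
def growth_percent(total_now: int, added: int) -> float:
    """Percent growth relative to the total before the additions.

    Assumption: the earlier total is simply total_now - added.
    """
    previous_total = total_now - added
    return 100 * added / previous_total


# Figures quoted from the FamilyTreeDNA haplotree update above.
branch_growth = growth_percent(total_now=37_534, added=12_696)
snp_growth = growth_percent(total_now=349_097, added=131_820)

print(f"Branches: {branch_growth:.0f}% growth")
print(f"SNPs: {snp_growth:.0f}% growth")
```

Under that assumption, both results round to the 51% and 61% quoted in the post.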

These three new articles in 2020 will get you started on your Y DNA journey!

Mitochondrial DNA – Matrilineal Line of Humankind is Being Rewritten

The original Oxford Ancestors mitochondrial DNA test tested 400 locations. The original Family Tree DNA test tested around 1000 locations. Today, the full sequence mitochondrial DNA test is standard, testing all 16,569 locations of the mitochondria.

Mitochondrial DNA tracks your mother’s direct maternal, or matrilineal, line. I’ve created a mitochondrial DNA resource page, here, that includes easy step-by-step instructions for after you receive your results.

New articles in 2020 included the introduction of The Million Mito Project. 2021 should see the first results – including a paper currently in the works.

The Million Mito Project is rewriting the haplotree of womankind. The current haplotree has expanded substantially since the first handful of haplogroups thanks to thousands upon thousands of testers, but there is so much more information that can be extracted today.

Y and Mitochondrial Resources

If you don’t know of someone in your family to test for Y DNA or mitochondrial DNA for a specific ancestral line, you can always turn to the Y DNA projects at Family Tree DNA by searching here.

The search provides you with a list of projects available for a specific surname along with how many customers with that surname have tested. Looking at the individual Y DNA projects will show the earliest known ancestor of the surname line.

Another resource, WikiTree, lists people who have tested for the Y DNA, mitochondrial DNA and autosomal DNA lines of specific ancestors.


On the left side, my maternal great-grandmother’s profile card, and on the right, my paternal great-great-grandfather. You can see that someone has tested for the mitochondrial DNA of Nora (OK, so it’s me) and the Y DNA of John Estes (definitely not me.)

MitoYDNA, a nonprofit volunteer organization, created a comparison tool to replace Ysearch and Mitosearch when they bit the dust thanks to GDPR.

MitoYDNA accepts uploads from different sources and allows uploaders to not only match to each other, but to view the STR values for Y DNA and the mutation locations for the HVR1 and HVR2 regions of mitochondrial DNA. Mags Gaulden, one of the founders, explains in her article, What sets mitoYDNA apart from other DNA Databases?.

If you’ve tested at nonstandard companies, not realizing that they didn’t provide matching, or if you’ve tested at a company like Sorenson, Ancestry, and now Oxford Ancestors that is going out of business, uploading your results to mitoYDNA is a way to preserve your investment. PS – I still recommend testing at FamilyTreeDNA in order to receive detailed results and compare in their large database.

CentiMorgans – The Word of Two Decades

The world of autosomal DNA turns on the centimorgan (cM) measure. What is a centimorgan, exactly? I wrote about that unit of measure in the article Concepts – CentiMorgans, SNPs and Pickin’ Crab.

Fortunately, new tools and techniques make using cMs much easier. The Shared cM Project was updated this year, and the results incorporated into a wonderfully easy tool used to determine potential relationships at DNAPainter based on the number of shared centiMorgans.

Match quality and potential relationships are determined by the number of shared cMs, and the chromosome browser is the best tool to use for those comparisons.

Chromosome Browser – Genetics Tool to View Chromosome Matches

Chromosome browsers allow testers to view their matching cMs of DNA with other testers positioned on their own chromosomes.

My two cousins’ DNA where they match me on chromosomes 1-4, is shown above in blue and red at Family Tree DNA. It’s important to know where you match cousins, because if you match multiple cousins on the same segment, from the same side of your family (maternal or paternal), that’s suggestive of a common ancestor, with a few caveats.

Some people feel that a chromosome browser is an advanced tool, but I think it’s simply standard fare – kind of like driving a car. You need to learn how to drive initially, but after that, you don’t even think about it – you just get in and go. Here’s help learning how to drive that chromosome browser.

Triangulation – Science Plus Group DNA Matching Confirms Genealogy

The next logical step after learning to use a chromosome browser is triangulation. In fact, you’re seeing triangulation above, but might not even realize it.

The purpose of genetic genealogy is to gather evidence to “prove” ancestral connections to either people or specific ancestors. In autosomal DNA, triangulation occurs when:

  • You match at least two other people (not close relatives)
  • On the same reasonably sized segment of DNA (generally 7 cM or greater)
  • And you can assign that segment to a common ancestor
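The first two conditions can be expressed as a simple check. The sketch below is illustrative only: the `Segment` layout, the function names, and the example positions are all hypothetical, not any vendor’s file format. It also simplifies by requiring each reported segment to be at least 7 cM, without measuring the size of the overlap itself, which would need a genetic map. The third condition, assigning the segment to a common ancestor, is genealogy work that no code can do for you.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """One matching segment between two testers (hypothetical layout)."""
    chromosome: int
    start: int   # start position in base pairs
    end: int     # end position in base pairs
    cm: float    # segment size in centiMorgans, as reported by a vendor


def overlap(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return the shared (start, end) range of two segments, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None


def triangulates(me_vs_a: Segment, me_vs_b: Segment, a_vs_b: Segment,
                 min_cm: float = 7.0) -> bool:
    """True if all three pairwise matches share a common stretch of DNA.

    Simplification: every segment must be at least min_cm, but the
    size of the common overlap itself is not measured in cM.
    """
    if min(me_vs_a.cm, me_vs_b.cm, a_vs_b.cm) < min_cm:
        return False
    shared = overlap(me_vs_a, me_vs_b)
    if shared is None:
        return False
    common = Segment(me_vs_a.chromosome, shared[0], shared[1], 0.0)
    # The two cousins must also match EACH OTHER on that same stretch.
    return overlap(common, a_vs_b) is not None


# Made-up example segments on chromosome 1.
me_vs_a = Segment(1, 10_000_000, 30_000_000, 18.2)
me_vs_b = Segment(1, 15_000_000, 40_000_000, 21.5)
a_vs_b = Segment(1, 12_000_000, 35_000_000, 20.0)
print(triangulates(me_vs_a, me_vs_b, a_vs_b))
```

The key design point is the final check: matching two cousins on the same segment is not enough unless they also match each other there, since otherwise one could be maternal and the other paternal.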

The same two cousins are shown above, with triangulated segments bracketed at MyHeritage. I’ve identified the common ancestor with those cousins that those matching DNA segments descend from.

MyHeritage’s triangulation tool confirms by bracketing that these cousins also match each other on the same segment, which is the definition of triangulation.

I’ve written a lot about triangulation recently.

If you’d prefer a video, I recorded a “Top Tips” Facebook LIVE with MyHeritage.

Why is Ancestry missing from this list of triangulation articles? Ancestry does not offer a chromosome browser or segment information. Therefore, you can’t triangulate at Ancestry. You can, however, transfer your Ancestry DNA raw data file to either FamilyTreeDNA, MyHeritage, or GEDmatch, all three of which offer triangulation.

Step by step download/upload transfer instructions are found in this article:

Clustering Matches and Correlating Trees

Based on what we’ve seen over the past few years, we can no longer depend on the major vendors to provide all of the tools that genealogists want and need.

Of course, I would encourage you to stay with mainstream products being used by a significant number of community power users. As with anything, there is always someone out there that’s less than honorable.

2020 saw a lot of innovation and new tools introduced. Maybe that’s one good thing resulting from people being cooped up at home.

Third-party tools are making a huge difference in the world of genetic genealogy. My favorites are Genetic Affairs (their AutoCluster tool is shown above), DNAPainter, and DNAGedcom.

These articles should get you started with clustering.

If you like video resources, here’s a MyHeritage Facebook LIVE that I recorded about how to use AutoClusters:

I created a compiled resource article for your convenience, here:

I have not tried a newer tool, YourDNAFamily, that focuses only on 23andMe results although the creator has been a member of the genetic genealogy community for a long time.

Painting DNA Makes Chromosome Browsers and Triangulation Easy

DNAPainter takes the next step, providing a repository for all of your painted segments. In other words, DNAPainter is both a solution and a methodology for mass triangulation across all of your chromosomes.

Here’s a small group of people who match me on the same maternal segment of chromosome 1, including those two cousins in the chromosome browser and triangulation sections, above. We know that this segment descends from Philip Jacob Miller and his wife because we’ve been able to identify that couple as the most distant ancestor intersection in all of our trees.

It’s very helpful that DNAPainter has added the functionality of painting all of the maternal and paternal bucketed matches from Family Tree DNA.

All you need to do is to link your known matches to your tree in the proper place at FamilyTreeDNA, then they do the rest by using those DNA matches to indicate which of the rest of your matches are maternal and paternal. Instructions, here. You can then export the file and use it at DNAPainter to paint all of those matches on the correct maternal or paternal chromosomes.

Here’s an article providing all of the DNAPainter Instructions and Resources.

DNA Matches Plus Trees Enhance Genealogy

Of course, utilizing DNA matching plus finding common ancestors in trees is one of the primary purposes of genetic genealogy – right?

Vendors have linked the steps of matching DNA with matching ancestors in trees.

Genetic Affairs takes this a step further. If you don’t have an ancestor in your tree, but your matches have common ancestors with each other, Genetic Affairs assembles those trees to provide you with those hints. Of course, that common ancestor might not be relevant to your genealogy, but it just might be too!


This tree does not include me, but two of my matches descend from a common ancestor and that common ancestor between them might be a clue as to why I match both of them.

Ethnicity Continues to be Popular – But Is No Shortcut to Genealogy

Ethnicity is always popular. People want to “do their DNA” and find out where they come from. I understand. I really do. Who doesn’t just want an answer?

Of course, it’s not that simple, but that doesn’t mean it’s not disappointing to people who test for that purpose with high expectations. Hopefully, ethnicity will pique their curiosity and encourage engagement.

All four major vendors rolled out updated ethnicity results or related tools in 2020.

The future for ethnicity, I believe, will be held in integrated tools that allow us to use ethnicity results for genealogy, including being able to paint our ethnicity on our chromosomes as well as perform segment matching by ethnicity.

For example, if I carry an African segment on chromosome 1 from my father, and I match one person from my mother’s side and one from my father’s side on that same segment – one or the other of those people should also have that segment identified as African. That information would inform me as to which match is paternal and which is maternal.

Not only that, this feature would help immensely tracking ancestors back in time and identifying their origins.

Will we ever get there? I don’t know. I’m not sure ethnicity is or can be accurate enough. We’ll see.

Transition to Digital and Online

Sometimes the future drags us kicking and screaming from the present.

With the imposed isolation of 2020, conferences quickly moved to an online presence. The genealogy community has all pulled together to make this work. The joke is that 2020’s most used phrase is “can you hear me?” I can vouch for that.

Of course, while the year 2020 is over, the problem isn’t, and it will extend at least through the first half of 2021 and possibly longer. Conferences are planned months, up to a year, in advance and they can’t turn on a dime, so don’t even begin to expect in-person conferences until either late in 2021 or more likely, 2022 if all goes well this year.

I expect the future will eventually return to in-person conferences, but not entirely.

Finding ways to be more inclusive allows people who don’t want to or can’t travel or join in-person to participate.

I’ve recorded several sessions this year, mostly for 2021. Trust me, these could be a comedy, mostly of errors😊

I participated in four MyHeritage Facebook LIVE sessions in 2020 along with some other amazing speakers. This is what “live” events look like today!

Screenshot courtesy MyHeritage

A few days ago, I asked MyHeritage for a list of their LIVE sessions in 2020 and was shocked to learn that there were more than 90 in English, all free, and you can watch them anytime. Here’s the MyHeritage list.

By the way, every single one of the speakers is a volunteer, so say a big thank you to the speakers who make this possible, and to MyHeritage for the resources to make this free for everyone. If you’ve ever tried to coordinate anything like this, it’s anything but easy.

Additionally, I’ve created two webinars this year for Legacy Family Tree Webinars.

Geoff Rasmussen put together the list of their top webinars for 2020, and I was pleased to see that I made the top 10! I’m sure there are MANY MORE you’d be interested in watching. Personally, I’m going to watch #6 yet today! Also, #9 and #22. You can always watch new webinars for free for a few days, and you can subscribe to watch all webinars, here.

The 2021 list of webinar speakers has been announced here, and while I’m not allowed to talk about something really fun that’s upcoming, let’s just say you definitely have something to look forward to in the springtime!

Also, don’t forget to register for RootsTech Connect which is entirely online and completely free, February 25-27, here.

Thank you to Penny Walters for creating this lovely graphic.

There are literally hundreds of speakers providing sessions in many languages for viewers around the world. I’ve heard the stats, but we can’t share them yet. Let me just say that you will be SHOCKED at the magnitude and reach of this conference. I’m talking dumbstruck!

During one of our Zoom calls, one of the organizers said it feels like we’re constructing the plane as we’re flying, and I can confirm his observation – but we are getting it done – together! All hands on deck.

I’ll be presenting an advanced session about triangulation as well as a mini-session in the FamilySearch DNA Resource Center about finding your mother’s ancestors. I’ll share more information as it’s released and I can.

Companies and Owners Come & Go

You probably didn’t even notice some of these 2020 changes. Aside from the death of Bryan Sykes (RIP, Bryan), the big news and the even bigger unknown is the acquisition of Ancestry by Blackstone. Recently the CEO, Margo Georgiadis, announced that she was stepping down. The Ancestry Board of Directors has announced an external search for a new CEO. All I can say is that very high on the priority list should be someone who IS a genealogist and who understands how DNA applies to genealogy.

Other changes included:

In the future, as genealogy and DNA testing become ever more popular and even more of a commodity, company sales and acquisitions will become more commonplace.

Some Companies Reduced Services and Cut Staff

I understand this too, but it’s painful. The layoffs occurred before Covid, so they didn’t result from Covid-related sales reductions. Let’s hope we see renewed investment after the Covid mess is over.

In a move that may or may not be related to an attempt to cut costs, Ancestry removed 6 and 7 cM matches from their users, freeing up processing resources, hardware, and storage requirements and thereby reducing costs.

I’m not going to beat this dead horse, because Ancestry is clearly not going to move on this issue, nor on that of the much-requested chromosome browser.

Later in the year, 23andMe also removed matches and other features, although, to their credit, they have restored at least part of this functionality and have provided ethnicity updates to V3 and V4 kits which wasn’t initially planned.

It’s also worth noting that early in 2020, 23andMe laid off 100 people as sales declined. Since that time, 23andMe has increasingly pushed consumers to pay to retest on their V5 chip.

About the same time, Ancestry also cut their workforce by about 6%, or about 100 people, also citing a slowdown in the consumer testing market. Ancestry also added a health product.

I’m not sure if we’ve reached market saturation or are simply seeing a leveling off. I wrote about that in DNA Testing Sales Decline: Reason and Reasons.

Of course, the pandemic economy where many people are either unemployed or insecure about their future isn’t helping.

The various companies need some product diversity to survive downturns. 23andMe is focused on medical research with partners who pay 23andMe for the DNA data of customers who opt-in, as does Ancestry.

Both Ancestry and MyHeritage provide subscription services for genealogy records.

FamilyTreeDNA is part of a larger company, GenebyGene, whose genetics labs do processing for other companies and medical facilities.

A huge thank you to both MyHeritage and FamilyTreeDNA for NOT reducing services to customers in 2020.

Scientific Research Still Critical & Pushes Frontiers

Now that DNA testing has become a commodity, it’s easy to lose track of the fact that DNA testing is still a scientific endeavor that requires research to continue to move forward.

I’m still passionate about research after 20 years – maybe even more so now because there’s so much promise.

Research bleeds over into the consumer marketplace where products are improved and new features created allowing us to better track and understand our ancestors through their DNA that we and our family members inherit.

Here are a few of the research articles I published in 2020. You might notice a theme here – ancient DNA. What we can learn now due to new processing techniques is absolutely amazing. Labs can share files and information, providing the ability to “reprocess” the data, not the DNA itself, as more information and expertise becomes available.

Of course, in addition to this research, the Million Mito Project team is hard at work rewriting the tree of womankind.

If you’d like to participate, all you need to do is to either purchase a full sequence mitochondrial DNA kit at FamilyTreeDNA, or upgrade to the full sequence if you tested at a lower level previously.

Predictions

Predictions are risky business, but let me give it a shot.

Looking back a year, Covid wasn’t on the radar.

Looking back 5 years, neither Genetic Affairs nor DNAPainter was yet on the scene. DNAAdoption had just been formed in 2014, and DNAGedcom, which was born out of DNAAdoption, didn’t yet exist.

In other words, the most popular tools today didn’t exist yet.

GEDmatch, founded in 2010 by genealogists for genealogists, was 5 years old, but was sold in December 2019 to Verogen.

We were begging Ancestry for a chromosome browser, and while we’ve pretty much given up beating them, because the horse is dead and they can sell DNA kits through ads focused elsewhere, that doesn’t mean genealogists don’t still need and want chromosome- and segment-based tools. Why, you’d think that Ancestry really doesn’t want us to break through those brick walls. That would be very bizarre, because every brick wall that falls reveals two more ancestors that need to be researched and spurs a frantic flurry of midnight searching. If you’re laughing right now, you know exactly what I mean!

Of course, if Ancestry provided a chromosome browser, it would cost development money for no additional revenue and their customer service reps would have to be able to support it. So from Ancestry’s perspective, there’s no good reason to provide us with that tool when they can sell kits without it. (Sigh.)

I’m not surprised by the management shift at Ancestry, and I wouldn’t be surprised to see several big players go public in the next decade, if not the next five years.

As companies increase in value, the number of private individuals who could afford to purchase the company decreases quickly, leaving acquisition by a private corporation or becoming publicly held as the only options. Sometimes, that’s a good thing because investment dollars are infused into new product development.

What we desperately need, and I predict will happen one way or another, is a marriage of individual tools and functions that exist separately today, with a dash of innovation. We need tools that will move beyond confirming existing ancestors – and will be able to identify ancestors through our DNA – out beyond each and every brick wall.

If a tester’s DNA matches to multiple people in a group descended from a particular previously unknown couple, and the timing and geography fits as well, that provides genealogical researchers with the hint they need to begin excavating the traditional records, looking for a connection.

In fact, this is exactly what happened with mitochondrial DNA – twice now. A match and a great deal of digging by one extremely persistent cousin resulted in identifying potential parents for a brick-wall ancestor. Autosomal DNA then confirmed that my DNA matched with 59 other individuals who descend from that couple through multiple children.

BUT, we couldn’t confirm those ancestors using autosomal DNA UNTIL WE HAD THE NAMES of the couple. DNA has the potential to reveal those names!

I wrote about that in Mitochondrial DNA Bulldozes Brick Wall and will be discussing it further in my RootsTech presentation.

The Challenge

We have most of the individual technology pieces today to get this done. Of course, the combined technological solution would require significant computing resources and processing power – just at the same time that vendors are desperately trying to pare costs to a minimum.

Some vendors simply aren’t interested, as I’ve already noted.

However, the winner, other than us genealogists, of course, will be the vendor who can either devise solutions or partner with others to create the right mix of tools that will combine matching, triangulation, and trees of your matches to each other, even if you don’t share a common ancestor.

We need to follow the DNA past the current end of the branch of our tree.

Each triangulated segment has an individual history that will lead not just to known ancestors, but to their unknown ancestors as well. We have reached critical mass in terms of how many people have tested – and more success would encourage more and more people to test.

There is a genetic path over every single brick wall in our genealogy.

Yes, I know that’s a bold statement. It’s not futuristic Jetsons flying-car stuff. It’s doable – but it’s a matter of commitment, investment money, and finding a way to recoup that investment.

I don’t think it’s possible for the one-time purchase of a $39-$99 DNA test, especially when it’s not a loss-leader for something else like a records or data subscription (MyHeritage and Ancestry) or a medical research partnership (Ancestry and 23andMe.)

We’re performing these analysis processes manually and piecemeal today. It’s extremely inefficient and labor-intensive – which is why it often fails. People give up. And the process is painful, even when it does succeed.

This process has also been made increasingly difficult because some vendors block tools that help genealogists by downloading match and ancestral tree information. Before Ancestry closed access, I was creating theories based on common ancestors in my matches’ trees that weren’t in mine – then testing those theories both genetically (clusters, AutoTrees and ThruLines) and also by digging into traditional records to search for the genetic connection.

For example, I’m desperate to identify the parents of my James Lee Clarkson/Claxton, so I sorted my spreadsheet by surname and began evaluating everyone who had a Clarkson/Claxton in their tree in the 1700s in Virginia or North Carolina. But I can’t do that anymore now, either with a third-party tool or directly at Ancestry. Twenty million DNA kits sold for a minimum of $79 equals more than 1.5 billion dollars. Obviously, the issue here is not a lack of funds.
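The spreadsheet filtering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a reconstruction of any vendor's export or any third-party tool: the `TreeEntry` fields, the `candidate_ancestors` helper, and every sample row are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeEntry:
    """One ancestor found in the tree of a DNA match (hypothetical layout)."""
    match_name: str   # the DNA match whose tree holds this ancestor
    surname: str
    birth_year: int
    birth_place: str  # free text, e.g. "Halifax, Virginia"


def candidate_ancestors(entries, surnames, year_range, places):
    """Keep entries whose surname, birth year, and birth place all fit the theory."""
    wanted = {s.lower() for s in surnames}
    first, last = year_range
    hits = []
    for entry in entries:
        if entry.surname.lower() not in wanted:
            continue
        if not (first <= entry.birth_year <= last):
            continue
        if not any(p.lower() in entry.birth_place.lower() for p in places):
            continue
        hits.append(entry)
    # Sort by surname, then birth year, the way a spreadsheet sort would.
    return sorted(hits, key=lambda e: (e.surname.lower(), e.birth_year))


# Invented rows, for illustration only.
SAMPLE = [
    TreeEntry("Match A", "Claxton", 1745, "Bertie, North Carolina"),
    TreeEntry("Match B", "Clarkson", 1760, "Russell, Virginia"),
    TreeEntry("Match C", "Clarkson", 1850, "Hancock, Tennessee"),
    TreeEntry("Match D", "Estes", 1750, "Halifax, Virginia"),
]

hits = candidate_ancestors(
    SAMPLE,
    surnames=["Clarkson", "Claxton"],
    year_range=(1700, 1799),
    places=["Virginia", "North Carolina"],
)
for hit in hits:
    print(hit.match_name, hit.surname, hit.birth_year, hit.birth_place)
```

Each surviving row is a lead to evaluate, not a conclusion. The matches who share it still need to be checked genetically and in traditional records.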

Including Y and mitochondrial DNA resources in our genetic toolbox not only confirms accuracy but also provides additional hints and clues.

Sometimes we start with Y DNA or mitochondrial DNA, and wind up using autosomal and sometimes the reverse. These are not competing products. It’s not either/or – it’s *and*.

Personally, I don’t expect the vendors to provide this game-changing complex functionality for free. I would be glad to pay for a subscription for top-of-the-line innovation and tools. In what other industry do consumers expect to pay for an item once and receive constant life-long innovations and upgrades? That doesn’t happen with software, phones, or automobiles. I want vendors to be profitable so that they can invest in new tools that leverage the power of computing for genealogists to solve currently unsolvable problems.

Every single end-of-line ancestor in your tree represents a brick wall you need to overcome.

If you compare the cost of books, library visits, courthouse trips, and other research endeavors that often produce exactly nothing, these types of genetic tools would be both a godsend and an incredible value.

That’s it.

That’s the challenge, a gauntlet of sorts.

Who’s going to pick it up?

I can’t answer that question, but I can say that 23andMe can’t do this without supporting extensive trees, and Ancestry has shown absolutely no inclination to support segment data. You can’t achieve this goal without segment information or without trees.

Among the current players, that leaves two DNA testing companies and a few top-notch third parties as candidates, although, as the past has proven, the future is uncertain, fluid, and ever-changing.

It will be interesting to see what I’m writing at the end of 2025, or maybe even at the end of 2021.

Stay tuned.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

Books

Y DNA Resources and Repository

I’ve created a Y DNA resource page with the information in this article, here, as a permanent location where you can find Y DNA information in one place – including:

  • Step-by-step guides about how to utilize Y DNA for your genealogy
  • Educational articles and links to the latest webinars
  • Articles about the science behind Y DNA
  • Ancient DNA
  • Success stories

Please feel free to share this resource or any of the links to individual articles with friends, genealogy groups, or on social media.

If you haven’t already taken a Y DNA test, and you’re a male (only males have a Y chromosome), you can order one here. If you also purchase the Family Finder autosomal test, those results can be used to search together.

What is Y DNA?

Y DNA is passed directly from fathers to their sons, as illustrated by the blue arrow, above. Daughters do not inherit the Y chromosome. The Y chromosome is what makes males, male.

Every son receives a Y chromosome from his father, who received it from his father, and so forth, on up the direct patrilineal line.

Comparatively, mitochondrial DNA, the pink arrow, is received by both sexes of children from the mother through the direct matrilineal line.

Autosomal DNA, the green arrow, is a combination of randomly inherited DNA from many ancestors that is inherited by both sexes of children from both parents. This article explains a bit more.

Y DNA has Unique Properties

The Y chromosome is never admixed with DNA from the mother, so the Y chromosome that the son receives is identical to the father’s Y chromosome except for occasional minor mutations that take place every few generations.

This lack of mixture with the mother’s DNA plus the occasional mutation is what makes the Y chromosome similar enough to match against other men from the same ancestors for hundreds or thousands of years back in time, and different enough to be useful for genealogy. The mutations can be tracked within extended families.

In western cultures, the Y chromosome path of inheritance is usually the same as the surname, which means that the Y chromosome is uniquely positioned to identify the direct biological patrilineal lineage of males.

Two different types of Y DNA tests can be ordered that work together to refine Y DNA results and connect testers to other men with common ancestors.

FamilyTreeDNA provides STR tests with their 37, 67 and 111 marker test panels, and comprehensive STR plus SNP testing with their Big Y-700 test.

STR markers are used for genealogy matching, while SNP markers work with STR markers to refine genealogy further, plus provide a detailed haplogroup.

Think of a haplogroup as a genetic clan that tells you which genetic family group you belong to – both today and historically, before the advent of surnames.

This article, What is a Haplogroup? explains the basic concept of how haplogroups are determined.

In addition to the Y DNA test itself, Family Tree DNA provides matching to other testers in their database plus a group of comprehensive tools, shown on the dashboard above, to help testers utilize their results to their fullest potential.

You can order or upgrade a Y DNA test here. If you also purchase the Family Finder autosomal test, those results can be used to search together.

Step-by-Step – Using Your Y DNA Results

Let’s take a look at all of the features, functions, and tools that are available on your FamilyTreeDNA personal page.

What do those words mean? Here you go!

Come along while I step through evaluating Big Y test results.

Big Y Testing and Results

Why would you want to take a Big Y test and how can it help you?

While the Big Y-500 has been superseded by the Big Y-700 test today, you will still be interested in some of the underlying technology. STR matching still works the same way.

The Big Y-500 provided more than 500 STR markers and the Big Y-700 provides more than 700 – both significantly more than the 111 panel. The only way to receive these additional markers is by purchasing the Big Y test.

I have to tell you – I was skeptical when the Big Y-700 was introduced as the next step above the Big Y-500. I almost didn’t upgrade any kits – but I’m so very glad that I did. I’m not skeptical anymore.

This Y DNA tree rocks. A new visual format with your matches listed on their branches. Take a look!

Educational Articles

I’ve been writing about DNA for years and have selected several articles that you may find useful.

What kinds of information are available if you take a Y DNA test, and how can you use it for genealogy?

What if your father isn’t available to take a DNA test? How can you determine who else to test that will reveal your father’s Y DNA information?

Family Tree DNA shows the difference in the number of mutations between two men as “genetic distance.” Learn what that means and how it’s figured in this article.

Of course, there were changes right after I published the original Genetic Distance article. The only guarantees in life are death, taxes, and that something will change immediately after you publish.

Sometimes when we take DNA tests, or others do, we discover the unexpected. That’s always a possibility. Here’s the story of my brother who wasn’t my biological brother. If you’d like to read more about Dave’s story, type “Dear Dave” into the search box on my blog. Read the articles in publication order, and not without a box of Kleenex.

Often, what surprise matches mean is that you need to dig further.

The words paternal and patrilineal aren’t the same thing. Paternal refers to the paternal half of your family, while patrilineal refers only to the direct father-to-father line.

Just because you don’t have any surname matches doesn’t necessarily mean it’s because of what you’re thinking.

Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) aren’t the same thing and are used differently in genealogy.

Piecing together your ancestor’s Y DNA from descendants.

Haplogroups are something like our pedigree charts.

What does it mean when you have a zero for a marker value?

There’s more than one way to break down that brick wall. Here’s how I figured out which of 4 sons was my ancestor.

Just because you match the right line autosomally doesn’t mean it’s because you descend from the male child you think is your ancestor. Females gave their surnames to children born outside of a legal marriage which can lead to massive confusion. This is absolutely why you need to test the Y DNA of every single ancestral line.

When the direct patrilineal line isn’t the line you’re expecting.

You can now tell by looking at the flags on the haplotree where other people’s ancestral lines on your branch are from. This is especially useful if you’ve taken the Big Y test and can tell you if you’re hunting in the right location.

If you’re just now testing or tested in 2018 or after, you don’t need to read this article unless you’re interested in the improvements to the Big Y test over the years.

2019 was a banner year for discovery. 2020 was even more so, keeping up an amazing pace. I need to write a 2020 update article.

What is a terminal SNP? Hint – it’s not fatal😊

How the TIP calculator works and how to best interpret the results. Note that this tool is due for an update that incorporates more markers and SNP results too.

You can view the location of the Y DNA and mitochondrial DNA ancestors of people whose ethnicity you match.

Tools and Techniques

This free public tree is amazing, showing locations of each haplogroup and totals by haplogroup and country, including downstream branches.

Need to search for and find Y DNA candidates when you don’t know anyone from that line? Here’s how.

Yes, it’s still possible to resolve this issue using autosomal DNA. Non-matching Y DNA isn’t the end of the road, just a fork.

Science Meets Genealogy – Including Ancient DNA

Haplogroup C was an unexpected find in the Americas and reaches into South America.

Haplogroup C is found in several North American tribes.

Haplogroup C is found as far east as Nova Scotia.

Test by test, we made progress.

New testers, new branches. The research continues.

The discovery of haplogroup A00 was truly amazing when it occurred – the base of the phylotree in Africa.

The press release about the discovery of haplogroup A00.

In 2018, a living branch of A00 was discovered in Africa, and in 2020, an ancient DNA branch.

Did you know that haplogroups weren’t always known by their SNP names?

This brought the total of SNPs discovered by Family Tree DNA in mid-2018 to 153,000. I should contact the Research Center to see how many they have named at the end of 2020.

An academic paper split ancient haplogroup D, but then the phylogenetic research team at FamilyTreeDNA split it twice more! This might not sound exciting until you realize this redefines what we know about early man, in Africa and as he emerged from Africa.

Ancient DNA splits haplogroup P after analyzing the remains of two Jehai people from West Malaysia.

For years I doubted Kennewick Man’s DNA would ever be sequenced, but it finally was. Kennewick Man’s mitochondrial DNA haplogroup is X2a and his Y DNA was confirmed to Q-M3 in 2015.

Compare your own DNA to Vikings!

Twenty-seven Icelandic Viking skeletons tell a very interesting story.

Irish ancestors? Check your DNA and see if you match.

Ancestors from Hungary or Italy? Take a look. These remains have matches to people in various places throughout Europe.

The Y DNA story is nowhere near finished. Dr. Miguel Vilar, former Lead Scientist for National Geographic’s Genographic Project, provides additional analysis and adds a theory.

Webinars

Y DNA Webinar at Legacy Family Tree Webinars – a 90-minute webinar for those who prefer watching to learn! It’s not free, but you can subscribe here.

Success Stories and Genealogy Discoveries

Almost everyone has their own Y DNA story of discovery. Because the Y DNA follows the surname line, Y DNA testing often helps push those lines back a generation, or two, or four. When STR markers fail to be enough, we can turn to the Big Y-700 test which provides SNP markers down to the very tip of the leaves in the Y DNA tree. Often, but not always, family-defining SNP branches will occur which are much more stable and reliable than STR mutations – although SNPs and STRs should be used together.

Methodologies to find ancestral lines to test, or maybe descendants who have already tested.

DNA testing reveals an unexpected mystery several hundred years old.

When I write each of my “52 Ancestor” stories, I include genetic information, for the ancestor and their descendants, when I can. Jakob was special because, in addition to being able to identify his autosomal DNA, his Y DNA matches the ancient DNA of the Yamnaya people. You can read about his Y DNA story in Jakob Lenz (1748-1821), Vinedresser.

Please feel free to add your success stories in the comments.

What About You?

You never know what you’re going to discover when you test your Y DNA. If you’re a female, you’ll need to find a male that descends from the line you want to test via all males to take the Y DNA test on your behalf. Of course, if you want to test your father’s line, your father, or a brother through that father, or your uncle, your father’s brother, would be good candidates.

What will you be able to discover? Who will the earliest known ancestor with that same surname be among your matches? Will you be able to break down a long-standing brick wall? You’ll never know if you don’t test.

You can click here to upgrade an existing test or order a Y DNA test.

Share the Love

You can always forward these articles to friends or share by posting links on social media. Who do you know that might be interested?

_____________________________________________________________

Y DNA Haplogroup P Gets a Brand-New Root – Plus Some Branches

With almost 35,000 branches composed of 316,000 SNPs, branches on the Y DNA tree are split every day. In fact, roughly 1,000 branches are being added to the Y DNA tree of mankind at Family Tree DNA each month. I wrote about how to navigate their public tree, here, and you can view the tree, here. You can also read about Y DNA terminology, here.

Splitting a deep, very old branch into subclades is unusual – and exciting. Finding a new root, taking the entire haplogroup back another notch in time is even more amazing, especially when that root is 46,000 years old.

Haplogroup P is the parent haplogroup of both Q and R.

This portion of the 2010 haplogroup poster provided to Family Tree DNA conference attendees shows the basic branching structure of haplogroup P, R and Q, with haplogroup P being defined at that time by several equivalent SNPs that had not yet been split into any other subgroups or branches of P. Notice that P295 is shown, but not F115 or PF5850 which would be discovered in years to come.

Haplogroup R, a subclade of P, is the most common haplogroup in Europe, with roughly half of European men falling on some branch of haplogroup R.

Map and haplogroup R distribution courtesy of FamilyTreeDNA

In Ireland, nearly all men fall into a subgroup of haplogroup R.

A lot of progress has been made in the past decade.

This week, FamilyTreeDNA identified a split in haplogroup P, upstream of haplogroups Q and R, establishing a new root above haplogroup P-P295.

The Previous 2020 Tree

This is a 2020 “before” picture of the tree as it pertains to haplogroup P. You can see P-P295 at the top as the root or beginning mutation that defined haplogroup P. That was, of course, before this new discovery.

At Family Tree DNA, according to this tree where testers self-identify the location of their most distant known patrilineal ancestor, haplogroup P testers are found in multiple Asian locations. Some haplogroup P kits may have only purchased specific SNP tests, not the full Big Y and would actually be placed on downstream branches if they upgraded. Haplogroup P itself is quite rare and generally only found in Siberia, Southeast Asia, and diaspora regions.

Subgroups Q and R are found across Europe and Asia. Additionally, some subgroups of haplogroup Q migrated across the land bridge, Beringia, to populate the Americas.

You might be wondering – if there are only a few people who fall directly into haplogroup P, how was it split?

Great question.

How Was Haplogroup P Split?

Testing of ancient DNA has been a boon to science and genealogy, both, and one of my particular interests.

Recently, Goran Runfeldt, who heads the R&D team at FamilyTreeDNA, was reading the paper titled Ancient migrations in Southeast Asia and noticed that in the supplementary material, several genomic files from ancient samples were available to download. Of course, that was just the beginning, because the files had to be aligned and processed – then the accuracy verified – requiring input from other team members including Michael Sager who maintains the Y DNA haplotree.

Additionally, the paper’s authors sequenced the whole genomes of two present-day Jehai people from northern Perak State, West Malaysia, a small group of traditional hunter-gatherers, many of whom still live in isolation. One of those samples was the individual whose Y DNA provided the new root SNP, P-PF5850, that is located above the previous root of haplogroup P, P-P295.

Until this sample was analyzed by Goran, Michael and team, three SNPs, PF5850, P295 and F115, were considered to be equivalent, because no tie-breaker had surfaced to indicate which SNPs occurred in what order. Now we know that PF5850 happened first and is the root of haplogroup P.

I asked Michael Sager, the phylogeneticist at FamilyTreeDNA, better-known as “Mr. Big Y,” due to his many-years-long Godfather relationship with the Y DNA tree, how he knew where to place PF5850, and how it became a new root.

Michael explained that we know that P-PF5850 is the new root because the three SNPs that indicated the previous root, P295, PF5850 and F115, are present in all previous samples, but mutations at both P295 and F115 are absent in the new sample, indicating that PF5850 preceded what is now the old P root.

The two SNPs, P295 and F115 occurred some time later.
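Michael's reasoning can be written as a tiny set comparison. What follows is a simplified sketch and not FamilyTreeDNA's actual placement pipeline: the function name `split_equivalent_snps` is invented, and the sample is reduced to just the three SNPs under discussion.

```python
def split_equivalent_snps(equivalent, new_sample_derived):
    """Split a block of equivalent SNPs using one tie-breaker sample.

    SNPs the new sample carries must be older (upstream). SNPs it lacks
    arose later (downstream), on the line leading to all previous samples.
    If either list comes back empty, the sample is not a tie-breaker.
    """
    upstream = sorted(s for s in equivalent if s in new_sample_derived)
    downstream = sorted(s for s in equivalent if s not in new_sample_derived)
    return upstream, downstream


# The three SNPs that were considered equivalent before the new sample.
equivalent_block = {"PF5850", "P295", "F115"}

# Simplified view of the new sample: it carries PF5850 but lacks the other two.
new_sample = {"PF5850"}

upstream, downstream = split_equivalent_snps(equivalent_block, new_sample)
print("Older (new root):", upstream)
print("Younger (old root):", downstream)
```

A sample carrying all three mutations, or none of them, would leave the block unsplit, which is why the three SNPs stayed equivalent for so long.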

This sample also included more than 300 additional unique mutations that may become branches in the future. As more people test and more ancient samples are found and sequenced, there’s lots of potential for further branching. Even with more than 50,000 NGS Big-Y DNA tests in the Family Tree DNA database, there’s still so much we don’t know, yet to be discovered.

Amazingly, mutation P-PF5850 occurred approximately 46,000 years ago, meaning that this branch had remained hidden all this time. For all we know, the man who provided this sample might be the only man left alive with this particular lineage of mankind, but it’s likely more will surface eventually.

Michael Sager had previously analyzed samples from The population history of northeastern Siberia since the Pleistocene by Sikora et al. You’ll notice that additional branches of haplogroup P are reflected in ancient samples Yana1 and Yana2 which split P-M45, twice.

Branch Definitions

Today, haplogroup branches are defined by their SNP name, except for base and main branches such as P, P1, P2, etc. Haplogroup P is very old and you’ll find it referred to as simply P, P1 or P2 in most literature, not by SNP name. Goran labeled the old branch names beside the current SNP names, and provided a preliminary longhand letter+number branch name with the * for explanatory purposes.

The problem with the old letter+number system is that when new upstream branches are inserted, the current haplogroup “P” has to shift down and become something else. That’s problematic when reading papers. In order to understand which SNP the paper is actually referencing, you have to know what SNP was labeled as “P” at the time the paper was written.

For example, a new P was just defined, so P becomes P1, but the previous P1 has to become something else, resulting in a domino effect of renaming. While that’s not a significant issue with haplogroup P, because it has seldom changed, it’s a huge challenge with the 17,000+ haplogroup R branches. Hence, the transition several years ago to using SNP names such as P295 instead of the older letter+number designations such as P, which now needs to become something like P1.
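The domino effect is easy to see with a toy model. The sketch below assumes a deliberately simplified naming rule, a single chain of branches labeled by position as P, P1, P2 and so on, which is much simpler than real longhand nomenclature. The names `EXAMPLE_A` and `EXAMPLE_B` are placeholders, not real SNPs.

```python
def longhand_labels(chain, base="P"):
    """Label a root-to-tip chain of SNP-defined branches by position.

    The first branch gets the bare base letter, the next one base + "1",
    and so on. This is a toy rule, used only to show the renaming problem.
    """
    return {snp: base if i == 0 else f"{base}{i}" for i, snp in enumerate(chain)}


# Chain before the discovery: P295 is the root, followed by placeholder branches.
old_chain = ["P295", "EXAMPLE_A", "EXAMPLE_B"]

# Chain after the discovery: PF5850 is inserted above P295.
new_chain = ["PF5850"] + old_chain

before = longhand_labels(old_chain)
after = longhand_labels(new_chain)

# Every branch whose longhand label changed, although its SNP name did not.
renamed = {
    snp: (before[snp], after[snp])
    for snp in old_chain
    if before[snp] != after[snp]
}
for snp, (old, new) in renamed.items():
    print(f"{snp}: {old} -> {new}")
```

Inserting one branch at the top relabels every branch beneath it, while the SNP names themselves never move. That stability is the reason SNP names are easier to track from paper to paper.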

Haplogroup Ages

Goran was kind enough to provide additional information as well, including the estimated “Time to Most Recent Common Ancestor,” or TMRCA, a feature currently in development for all haplogroups. You can see that P-PF5850 is estimated to be approximately 46,000 years old, “ca 46 kybp,” meaning “circa 46 thousand years before present.”

The founding ancestor of haplogroup Q lived approximately 31,000 years ago, and ancestral R lived about 28,000 years ago, someplace in Asia. Their common ancestor, P-P226, lived about 33,000 years ago.

How cool is it that you can peer back in time to view these ancient lineages – the story still told in our Y DNA today?

What About You?

If you’re a male, you can upgrade to or purchase a Big Y-700 to participate, here. In addition to discovering where you fall on the tree of mankind, you’ll discover who you match on your direct patrilineal side and where their ancestors are located in the world.

_____________________________________________________________

23andMe Genetic Tree Provides Critical Clue to Solve 137-Year-Old Disappearance Mystery

DNA can convey messages from the great beyond – from times past and people that died long before we were born.

I had the most surprising experience this week. It began with receiving an email with the sender name of my long-time research buddy, cousin Garmon Estes.

It’s all the more surprising because not only did Garmon never own a computer, despite my ceaseless encouragement, he passed over in 2013 at the age of 85. So, imagine my shock to open my email to see a message from Garmon. Cue up spooky music😊

As it turned out, Garmon’s nephew is also Garmon. I had communicated with the family off and on over the years since the death of Garmon the elder. Garmon, the younger, had written to tell me that the second “great brick wall” that haunted his Uncle Garmon had fallen – and how that happened, thanks to DNA.

Garmon, the Elder

Garmon Estes, the elder

I first met Garmon the elder, via letter, back in the 1970s or maybe early 80s. He was an experienced genealogist and I was beginning.

At that time, Garmon had been chasing the identity of the father of our common ancestor, John R. Estes, for decades, and I was just embarking on what would become a lifelong adventure, or perhaps it could better be called an obsession.

John R. Estes had moved from some unknown location to Claiborne County, Tennessee with his wife and family about 1820. That’s pretty much all we knew at that time. Garmon had spent decades before the age of online records researching every John Estes he could find. I can’t even begin to tell you how many John Esteses existed that needed to be eliminated as candidates.

Garmon lived in California, far from Tennessee. I lived in Indiana, then Michigan – significantly closer. He began caring for his ill spouse, and I began traveling to dusty courthouses, sometimes reading musty books page by yellowed page, extracting everything Estes. Garmon worked from his local Family History Center when he could and wrote letters.

Between our joint sleuthing and many theories that we both composed and subsequently shot down, we narrowed John R. Estes’s location of origin to Halifax County, Virginia. However, there were multiple John Esteses living there at the same time, about the same age, none using middle initials reliably, and some not at all. How inconsiderate!

I began perusing every possible record. I had eliminated some Johns as candidates, most often because they clearly remained in the community after our John had moved to Claiborne County. Late one night, in our local family history center, I found that fateful clue – John R. Estes noted as (S.G.) short for “son of George,” on just one tax list. All it takes is that one gold-nugget record.

It was after 10 PM when I left the Family History Center and even later when I got home. I debated whether I should call Garmon or not, but I decided that indeed, he would want to know immediately, even if I did call at an inconvenient time or wake him up.

The discovery of John’s father, of course, opened the door for much more research, and it solved one of Garmon’s two brick walls that had haunted his genealogy life.

He never solved the second one, but it wasn’t for lack of trying.

What Happened to Willis Alexander Garmon Estes?

Willis Alexander Garmon Estes was born on December 21, 1854, in Lenoir, Roane County, TN. His nickname was Willie.

Willie married Martha Lee Mathis in 1874 and they had 4 children beginning with the first child born the next year in Roane County. Sometime between 1875 and the birth of the second child in 1877, they migrated to Greenwood, Wise County, Texas where their next two children were born in 1877 and 1881.

Martha was pregnant with their fourth child in 1883 when something very strange happened. Willie disappeared, and I do mean literally and completely. Just poof, gone.

Not sure what to do, Martha’s father, who lived in Missouri, went to Texas to retrieve his pregnant daughter and her children and took her and the children home to Missouri where their last child was born that September.

Willie was only 28 when he vanished. The family, of course, had many stories about what happened. Texas at that time was pretty much the “wild west” and the stories about Willie reflected exactly that.

Texas was sometimes the refuge of outlaws and shady characters. One story revealed that Willie had shot a man back in Tennessee and the family fled to Louisiana, then Texas. Of course, that doesn’t tell us why he disappeared in Texas, but it opens the door to speculation and casts doubt on his character, perhaps.

Another story was that he was shot by Indians.

A third story stated that Willie settled in Indian Territory north of the Red River, now Oklahoma, and that he had an altercation with an Indian over the supposed theft of firewood, although who was accusing who was unclear. Willie shot the Indian, then had to flee for his life, leaving his pregnant wife and children as a posse of Indian Police surrounded his house. Willie supposedly promised Martha that he would return, but never did. It was reported that he was shot in Mexico, but no further details emerged.

Aren’t these just maddeningly vague???

Yet another story was that Willie headed for the goldfields of California, struck it rich, and was murdered on the way back home. The details varied, but one version had him murdered by a traveling companion on the trail. Another had him becoming ill and dying in a hospital in St. Louis where his wife went to search for him, to no avail. That might explain why she went back to Missouri, Garmon postulated. And yet a third version was some hybrid of the two where “someone” tried to find Willie’s family for years to reveal what had happened, and where, but was never successful. Of course, how did the family know about this if the mystery person was unable to find the family? But I digress.

Garmon desperately wanted to solve that mystery. He wanted closure.

I didn’t realize that the genealogy bug had bitten Garmon’s nephew too, but it clearly has. Garmon would be so proud.

With Garmon the younger’s permission, I’m publishing “the rest of the story,” Connecting the Dots, as written by Garmon the younger, with a few technical interjections from me involving DNA from time to time.

Connecting the Dots

In 2015, my dad Richard Estes, my brother Corey Estes, and I took a trip to Texas and Oklahoma to see if we could find out more about Willis Alexander Garmon Estes’ disappearance.

Estes greenwood

We visited Greenwood, Texas and nearby Decatur where we looked at historical records at the Wise County Clerk Office. We also went up to Oklahoma City to see the state archives and to Tishomingo to look at any records that might be available.

Estes Oklahoma history.png

Interestingly enough, we did not find any clues as to the disappearance of Willis Alexander Garmon Estes. There were no newspaper articles or criminal records concerning any incidents with Willis Alexander Garmon Estes. The only new information that we found was a couple of land deeds showing that Willis Alexander Garmon Estes’ brother Fielding had bought and sold land in Wise County during the time that Willis Alexander Garmon Estes was living in Greenwood.

We left empty-handed on our trip but our curiosity remained strong and we began talking to each other about going on another trip to Tennessee to speak with Estes family members in Loudon County to see if they might know something about Willis Alexander Garmon’s disappearance.

DNA Testing

In December of 2018, my wife, children, and I had our DNA tested using the service 23andMe. We received test results within a month of sending in saliva samples. The results did not reveal anything unusual.

Fast forward to October 2019. 23andMe introduced a new Family Tree feature that automatically creates a family tree based on the DNA results that you share with relatives in 23andMe. This was a fascinating feature and I noticed that all of my family members were automatically placed into the correct position on the family tree without me having to do anything.

[Roberta’s note – this is not always the case, so don’t necessarily expect the same level of accuracy. The tree is a wonderful innovative feature, just treat family placement as hints and not facts.]

Every few weeks as more and more people had their DNA tested on 23andMe, new relatives were added to the family tree.

In February 2020, I noticed something interesting under the location of Willis Alexander Garmon Estes on the family tree. A woman by the name of Edna appeared as a descendant of Willis Alexander Garmon Estes. The first thing I did was to try and get in contact with her on 23andMe. No luck. Next, I thought maybe she was the descendant of one of Willis Alexander Garmon’s sons (James, John, or George). However, after researching the descendants of each of those lines, Edna’s name did not appear.

The next step I took was to look up as many Ednas by that last name on ancestry.com as I could find and trace their ancestry back to see where it led.

There were two Ednas by that last name in the United States whose age matched the one on 23andMe. I traced both of their ancestry lines back to the 1800’s. Neither one had Willis Alexander Garmon Estes as an ancestor.

Breakthrough

During the middle of March 2020, when I was quarantined at home from work due to the COVID-19 virus, I took another look at Edna’s family lines. I noticed there was a gentleman by the name of James Henry Houston mentioned as an ancestor.

The interesting thing about James was that he was born on the same day, same year, and in the same county as Willis Alexander Garmon Estes. James Henry Houston was born on December 26, 1854 in Loudon County, Tennessee. This seemed like possibly more than a coincidence, so I dived into the data a little bit more.

I looked at federal census records to find out more about James Henry Houston’s past. Strangely there were no official records of him until May 12, 1889 when he married Allie Ona Taylor in Erath, Texas. Normally, if someone is born in 1854, they would show up in one of the federal census records of 1860, 1870, or 1880. James Henry Houston does not show up in any official federal census records until 1900.

According to ancestry records, James Henry Houston married Allie Ona Taylor in 1889 and resided in the Hood County region of Texas until 1910. During this time, he raised 8 children with his wife Allie.

In 1920, the federal census placed him and Allie in Whitehall, Montana. The last federal census he appears in is 1930. He lived in Pomona, California where he died in 1933 at the age of 78.

At this point, I thought it was highly likely that James Henry Houston and Willis Alexander Garmon Estes were the same person. If my hunch was correct then a photo of James Henry Houston would most likely show a resemblance to his son, my great grandfather John Alexander Estes.

Estes James Henry Houston

The photos above show a remarkable similarity in the eyes, nose, mouth, and facial structure between the two men. To me, the photo and historical evidence is enough to conclude that Willis Alexander Garmon Estes is James Henry Houston.

Garmon’s Concluding Thoughts

As I reflect on the fact that Willis Alexander Garmon Estes renamed himself James Henry Houston and moved from Wise County down to Hood County, Texas, a distance of approximately 60 miles, to marry and raise a new family, many more questions come to mind.

What exactly happened to cause Willis Alexander Garmon Estes to leave his wife and children behind? Was it simply a marital dispute or did it involve a criminal offense and running from the law as was mentioned in the family lore?

Did my great grandfather know that his father lived in Pomona in 1930, which was only 6 miles away from where he was living in Rancho Cucamonga? Were there other family members that knew what happened but promised not to tell anyone else? We may never know.

Finally, I want to add one more piece to the story that I found fascinating. On ancestry.com, many of the family trees for James Henry Houston state that the mother and father of James Henry Houston were Jennie Bray and Henry Houston. No information is given for their birthdates or where they came from. The mother and father of Willis Alexander Garmon Estes were Jennie McVey and William Estes. The names Jennie Bray and Jennie McVey are very similar. In order to hide his true identity, James Henry Houston would have to make up a surname for his father since he called himself Houston, not Estes. Willis Alexander Garmon Estes had a brother named John Houston Estes. This might explain why James Henry Houston chose to use the surname Houston rather than another name.

Congratulations Garmon

I know this made Garmon the elder puff up with pride for Garmon the younger’s sleuthing skills and leap for joy at the solve. Garmon, the elder, had two main genealogy goals throughout his entire life. One was solved while he was living, but it took another generation to solve this one.

Great job, Garmon!

About the 23andMe Genetic Tree

23andMe is the only vendor to construct a “trial balloon” genetic tree based only on how the tester matches people and how they do, or don’t, match each other. This occurs with no input from testers in the form of genealogical trees or of identifying how people are related to the tester.

Family Tree DNA has Phased Family Matching, MyHeritage has Theories of Family Relativity, and Ancestry has ThruLines which all do some sort of DNA+tree+relationship connectivity, but since 23andMe does not support user-created or uploaded trees, anything they produce has to be using DNA alone.

On one hand, it’s frustrating for genealogists, but on the other hand, there is sometimes a benefit to a different “all genetic” approach.

Of course, the only information that 23andMe has to utilize unless your parents have tested is how closely you match your matches and how closely your matches match each other. This allows 23andMe to place your matches at least in a “neighborhood” on your tree, at least approximately accurate, unless your parents are related to each other and that shared DNA causes things to get dicey quickly.

I wrote about 23andMe’s new relationship triangulation tree when it was first introduced in September 2019, nearly a year ago, here. The launch was rocky for a number of reasons, and if you’ve done genealogy for a long time, your research goals are likely to be further back in time than this 4 generation relationship tree will reveal.

23andMe tree

This is what my relationship tree looked like at the time the function was launched. You’ll note that 23andMe places relationships back in time 4 generations, to your great-great-grandparents, meaning that you might have 3rd or even 4th cousins showing up on your genetic tree.

I initially had a total of 18 people placed on my tree, with 3 being close family, 4 being accurate, 4 unknown, 1 uncertain and 6, or one third, inaccurate.

Keep in mind that 23andMe doesn’t make any provision to accommodate or take into account half-relationships, like half-brother or half-sister, either currently or historically. Therefore, descendant placement predictions can be “off” because half-siblings only carry the DNA from one common parent, instead of two, making those relationships appear more distant than they really are.

In Garmon’s case, his great-great-grandfather is the ancestor who was MIA, so the genetic tree has the potential to work well for this purpose.

Estes 23andme tree today

Today, my tree looks somewhat different, with only 14 people displayed instead of 18, and 6 waiting in the wings to see if I can help 23andMe figure out how and where to place them.

Since the initial launch, customers have been given the opportunity to add their ancestors’ names to their nodes. This works just fine so long as nobody married more than once and had children from both marriages.

Estes Willie Alexander today

Here’s a closer image of the left-hand side of my tree where I’ve superimposed the location of Willis Alexander Garmon Estes and Edna, as they are related to Garmon the younger, at bottom right. Ignore the other names – I only utilized my own tree for an example tree structure.

One more generation and it’s unlikely that 23andMe would have made the connection between Edna and Garmon the younger.

Not only does this illustrate the perfect reason to test the oldest generations in your family, but also never to ignore an unknown match that seems to be within the past 3 or 4 generations. You never know what mysteries you might unravel.

Four generations actually reaches back in time quite substantially. In my case, my great-great-grandparents were born in 1805, 1810, 1812, 1813, 1815, 1816, 1818 (2), 1820, 1822, 1827, 1829, 1830, 1832, 1841 and 1848.

If you have mysteries within your closest 4 generations to unravel, the genetic tree at 23andMe might provide valuable clues, but only if you’re willing to do the requisite work to figure out HOW these people match you.

You can’t transfer your DNA file TO 23andMe, so if you want to have your results in the 23andMe database, you’ll need to test there.

Acknowledgments: Thank you to Garmon Estes, the younger, for generously sharing this story and allowing publication. My heart was warmed to see your generational research trip.

Thank you to Garmon Estes, the elder, for being my research partner for so many years. You can finally RIP now, although somehow I suspect you already have these answers.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

6 & 7 cM Matches: Are 172 ThruLines All Wrong?

Are some 6-8 cM matches valid and valuable? If not, then are my 172 ThruLines that Ancestry created for me that include my 8 great-grandparents surnames at that level all wrong? Or the total of 552 ThruLines at 6 and 7 cMs all wrong?

We all know by now that about half of 6 and 7 cM matches will be identical by chance, meaning not valid, but that leaves about half that ARE valid. We need clues to be able to figure out IF these matches are valid, and the logical place to start is by utilizing three techniques.

  • First, if both of our parents have tested, does the person also match one of our parents and, if a chromosome browser is available, on the same segment?

If the answer is no, there’s no need to go any further; this match is not valid. If yes, then we know it phases through one generation and we need to keep looking for evidence.

  • Second, the same litmus test, but with our closest known relatives that have tested. Does the match also match aunts, uncles, siblings, first cousins, or other known proven close relatives? Of course, if they match on the same segment, that’s family phasing and the beginning of triangulation and strongly, strongly suggests descent from the same common identified ancestor.

Note that Ancestry does NOT show you Shared Matches below 20 cM, so don’t assume those shared matches to family members don’t exist. Check your family members’ kits directly. Don’t rely only on Ancestry’s shared matches.

  • Third, surnames and trees that suggest common ancestral lines of DNA matches. That’s what Ancestry does for us with ThruLines. Let’s take a look at what I’ve found sorting and grouping my 6-8 cM matches at Ancestry.

There’s way more information than I expected to find.

Focus on Grouping

With Ancestry’s upcoming purge of all DNA customers’ 6 and 7 cM matches, inclusive, I’ve been very focused on grouping and saving those matches for future use. Otherwise, they will be gone forever, along with my genetic connection and any useful genealogical information.

I’ve written about the upcoming Ancestry purge here, here and here – including preservation strategies and how to communicate with Ancestry to share your feelings about this topic if you so choose. Note that this disproportionately affects people seeking unknown ancestors a few generations back in time.

Raise your hand if you have no unknown ancestors before 1870 or so…

Ancestry’s 6-8 cM Matches

I’ve been recording statistics as I’ve been grouping and working with results, and thought I’d share what I’ve found with you.

Ancestry tota.png

I have a total of 92,931 matches at Ancestry. This includes endogamous Acadian, Mennonite and Brethren lines, which produce lots of matches, but also multiple German and Dutch lines of relatively recent immigrants with almost no testers. So it probably evens out.

You’ll note that of my matches, 3,757 are estimated by Ancestry to be 4th cousin or closer, and Ancestry categorizes the rest of them as Distant matches, from 6-20 cM, although some of those wind up being closer than 4th cousins.

I have 27,926 6-cM matches, 16,846 7-cM matches and 11,428 8-cM matches. I was initially saving 8-cM matches because Ancestry was initially rounding 7.6 up to 8 and the only way to save all 7-cM matches was to save all 8-cM matches. Last week, Ancestry added decimal points so you don’t have to save 8-cM matches anymore, just all 6 and 7.
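As a rough illustration of why whole-number rounding meant saving the 8-cM bucket too, here is a minimal Python sketch. It assumes the displayed value was produced by rounding half up, which is only an inference from the 7.6-to-8 example above, not a documented Ancestry rule. The sample segment sizes and the function names are invented for this example.

```python
import math

# Matches below 8.0 cM were slated for removal, per the article.
PURGE_CEILING_CM = 8.0


def displayed_cm(true_cm: float) -> int:
    """Whole-number display value, ASSUMING round-half-up (not confirmed by Ancestry)."""
    return math.floor(true_cm + 0.5)


def slated_for_purge(true_cm: float) -> bool:
    """True for matches in the 6 to 7.99 cM range."""
    return 6.0 <= true_cm < PURGE_CEILING_CM


# Hypothetical segment sizes, invented for this example.
sample = [6.2, 7.4, 7.6, 7.99, 8.0, 8.3]

# Matches that look like 8 cM on screen but would still be purged.
hidden_in_8_bucket = [
    cm for cm in sample if slated_for_purge(cm) and displayed_cm(cm) == 8
]
print(hidden_in_8_bucket)
```

In this made-up sample, 7.6 and 7.99 both display as 8 even though they fall under the 8.0 cutoff, which is why grouping only the matches shown as 6 and 7 would have missed them before decimal points were added.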

Without additional tools, all of those matches are overwhelming – but that’s exactly WHY we need technologies such as clustering, triangulation, ThruLines which Ancestry provides, a chromosome browser, family phasing, shared matches below 20 cM, and more.

You can certainly look at known genealogy and make inferences about common ancestors when you match someone genetically, and that’s very useful in and of itself.

However, you need more than just the fact that you match someone to confirm that you share a specific common ancestor biologically, not just on paper. Having said that, just having the breadcrumb of a DNA match to lead you to your cousins isn’t a bad thing in and of itself.

Of my total matches:

  • 18% are 7 cM
  • 30% are 6 cM
  • That’s a total of 48% of my matches that would have been lost later in August if I hadn’t grouped them.
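If you want to run the same percentages on your own match list, the arithmetic can be sketched in a few lines of Python. The counts below are the ones quoted in this article. The variable and function names are just illustrative, so substitute your own totals.

```python
# Match counts quoted in the article.
TOTAL_MATCHES = 92_931
SIX_CM_MATCHES = 27_926
SEVEN_CM_MATCHES = 16_846


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


six_pct = share(SIX_CM_MATCHES, TOTAL_MATCHES)
seven_pct = share(SEVEN_CM_MATCHES, TOTAL_MATCHES)
purge_pct = share(SIX_CM_MATCHES + SEVEN_CM_MATCHES, TOTAL_MATCHES)

print(f"6 cM: {six_pct:.0f}%")
print(f"7 cM: {seven_pct:.0f}%")
print(f"6 and 7 cM combined: {purge_pct:.0f}%")
```

Rounded to whole percentages, these work out to the 30%, 18%, and 48% figures listed above.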

Some people feel that matches at this level aren’t useful, but the line in the sand is very thin between a 7.99 cM deleted match and an 8.0 retained match where the former is lumped into the “not useful, so no big deal to lose” bucket and the other is just fine and potentially useful.

I get it, I really do, that everyone gets tired of explaining that NO, you can’t find one match and assume a valid connection, and yes, digging for evidence is work. There is no magic wand. Smaller or larger matches, they all need additional cumulative evidence to indicate that the match is valid, and how.

It’s time-consuming and frustrating educating people HOW to utilize all DNA matching appropriately. Those smaller matches take more effort to work with and require more evidence of legitimacy, but there are absolutely, assuredly many legitimate, useful, matches between 6-8 cM.

Furthermore, many of those matches reach back in time to those elusive ancestors we are seeking and can’t yet identify. We need more and better tools, not less data. Conversely, some 6-8 cM matches are as close as third or fourth cousins. I found 4 in one family and we’re sharing photos of our ancestors who were siblings, born in 1827 and 1829, respectively.

I’m not throwing half of my 6-8 cM coins away because some are gold and some are counterfeit.

If you are, I’ll take all of your coins and I’ll be happy to sort out the gold, thank you😊

Where’s the Gold?

Ancestry filter

You can search and sort in any number of ways at Ancestry. First, I checked to see how many of my 6 and 7 cM matches had common ancestors as identified by Ancestry via ThruLines.

                             | 6 cM | 7 cM | Total
Common Ancestors (ThruLines) | 274  | 278  | 552

If I had not grouped these, I would have lost all 552 matches that Ancestry connected to common ancestors through ThruLines. Of course, each connection needs to be individually verified using traditional genealogical record searches. Keep in mind that ThruLines can only find matches where people connect in trees.

Without these 6 and 7 cM matches, any connecting genetic path or breadcrumbs to these people is gone.

Great-grandparents’ Surnames

Since I can filter by segment match size and surname, combined, at Ancestry, I decided to take a look at my 6-7 cM matches that would be purged had I not grouped them, and see what I can discover by surname utilizing the surnames of my great-grandparents.

That’s just 3 generations for me, meaning I could expect to carry more of the DNA of these ancestors than of ancestors further back in time.

I started with the “Match name” of Estes, meaning that the person who took the test has that name. Of course, some women could use their married surname, so this doesn’t mean that my match to that person is via that surname. It’s just a starting point, but probably a good hint.

I had 12 Estes surname matches in the 6-7 cM range. Of those:

  • 4 had no tree
  • 1 had a private tree
  • 1 had an unlinked tree
  • None had common identified ancestors, meaning ThruLines
  • That leaves me with 7 candidates to work with directly, including the unlinked tree
  • Of those, I knew how 5 of their trees connect to the Estes line

Of course, I have the benefit of having worked with the Estes genealogy for decades along with the benefit of trees and other resources not at Ancestry. Connecting these lines took me about 15 minutes. In essence, I’ve turned them into virtual “ThruLines” by identifying the common ancestor, even if Ancestry didn’t.

I have not yet worked with the rest of my surname matches in the same way, but by preserving them by grouping, I can in the future.

I searched for both the “Match Name” and the “Surname in the Matches’ Trees,” separately. Some who carry the surname aren’t going to have trees and conversely, finding the surname in your matches’ trees is by no means an indication that that particular surname or ancestor is why you’re matching. However, it’s a great hint and a place to begin your research, including shared matches.

Be sure to check alternate spellings of surnames too.

Note that a surname that can also be part of a name returns all possible connections. For example, if I’m searching for the Lore surname and the name of my match is Loreal Jones, it will still appear in the Match name list. The same applies to the name of the managing person. However, scrolling through these is pretty easy.

So, what did I find?

Results!

I created this chart of what I discovered using the surnames of my great-grandparents along with common alternate spellings.

Surname | Match Name | Surname in Matches’ Trees | Comments
Estes, Eastes | 13 matches, no ThruLines | 208 matches, 20 ThruLines |
Bolton | 6, no ThruLines | 121, 14 ThruLines | All 6 surname matches have trees and I can place some immediately.
Vannoy, Van Noy | 2, no ThruLines | 49, 10 ThruLines | I can place 1 of the 2 surname matches and connect them to the Vannoy line. Their tree is unlinked and another is private. Checking the “include similar surnames box” resulted in 2355 results. Won’t do that again.
Ferverda, Fervida, Ferwerda | 0 | 2, no ThruLines | Confirmed a common ancestor in the Netherlands with one tester. An 1860s immigrant line.
Miller | 175, 1 ThruLine | 2248, 95 ThruLines | Very common surname and Brethren. Shared matches, if over 20 cM which is Ancestry’s threshold, would potentially be very helpful.
Clarkson, Claxton | 2, no ThruLines | 96, 22 ThruLines | I need to break down a brick wall in this line. Also, maybe someone has a photo of my great-grandmother. I was able to provide a photo of someone else’s ancestors discovered as a 6 and 7 cM match to 4 family members.
Lore, Lord | 112, no ThruLines | 209, 10 ThruLines | Acadian, endogamous. Lore is part of many other names.
Kirsch | 0 | 18, 0 ThruLines | 1850s German immigrant line. This was VERY helpful. I’ve already found previously unknown cousins and one line that I thought was defunct, isn’t.
Total | 310, 1 ThruLine | 2951, 171 ThruLines | Total 3261 matches and 172 ThruLines
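For anyone keeping a similar tally in a spreadsheet or script, here is a small Python sketch that re-adds the chart’s columns. The numbers are copied from the chart above. The tuple layout and variable names are simply one way to arrange them, not anything Ancestry provides.

```python
# Each row: (surname, match-name matches, match-name ThruLines,
#            surname-in-tree matches, surname-in-tree ThruLines)
rows = [
    ("Estes, Eastes", 13, 0, 208, 20),
    ("Bolton", 6, 0, 121, 14),
    ("Vannoy, Van Noy", 2, 0, 49, 10),
    ("Ferverda, Fervida, Ferwerda", 0, 0, 2, 0),
    ("Miller", 175, 1, 2248, 95),
    ("Clarkson, Claxton", 2, 0, 96, 22),
    ("Lore, Lord", 112, 0, 209, 10),
    ("Kirsch", 0, 0, 18, 0),
]

name_matches = sum(row[1] for row in rows)
name_thrulines = sum(row[2] for row in rows)
tree_matches = sum(row[3] for row in rows)
tree_thrulines = sum(row[4] for row in rows)

total_matches = name_matches + tree_matches
total_thrulines = name_thrulines + tree_thrulines

print(f"Match name: {name_matches} matches, {name_thrulines} ThruLines")
print(f"Surname in trees: {tree_matches} matches, {tree_thrulines} ThruLines")
print(f"Overall: {total_matches} matches, {total_thrulines} ThruLines")
```

The sums agree with the Total row: 310 and 2951 matches, 1 and 171 ThruLines, for 3261 matches and 172 ThruLines overall. Keep in mind the same person can appear in both the Match Name and Surname in Matches’ Trees searches, so these are counts of search hits rather than guaranteed unique people.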

I’m not willing to throw these away.

Continue to Provide Feedback to Ancestry

I find the assertion that these smaller matches are neither accurate nor valuable simply mind-boggling. Clearly, as you can see above, these matches provide invaluable clues for us, as genealogists, to follow. Over time, I’ve proven many matches in this range (who have tested at or transferred to other vendors with a chromosome browser) to triangulate with several generations of family members using DNAPainter, so at least some matches are quite valid. And yes, we do have tools to accumulate evidence – the same exact tools we use for larger matches.

Imagine how much else is actually buried in those matches that could be distilled into useful information with technology tools.

I fully understand it’s in Ancestry’s best interest to delete these matches to free up processing resources, but I’m far from convinced that it’s in our best interest as avid genealogists.

I also realize that many if not most genealogists who aren’t as focused as many of you reading this article won’t notice or care, but that’s not the case for truly committed genealogists with years invested in this work. There’s valuable information there for those of us willing to commit our resources and invest our time to work on the matches.

The Proof is in the Pudding

The proof is in the results – those 3,261 surname matches that serve as immediate hints and 172 ThruLines that Ancestry themselves has assembled for us.

The more I work with these matches, the LESS convinced I am that they should be deleted. There is certainly chaff to be sifted and discarded, but Ancestry could take a more precise, surgical approach instead of a wholesale decapitation that will remove 48% of my matches and more for other people. I would certainly be more than happy to be part of a proactive discussion focusing on how to delete less useful matches or those we’ve determined to be invalid, but preserve the rest.

Of course, the easiest option would simply be for Ancestry to allow us to elect to retain current and elect to receive future 6-8 cM matches by checking a simple box and continue to provide those for those of us who care and are willing to work with them.

Yes, the remaining matches after the purge will indeed “be more accurate,” as Ancestry says, because fewer will be false, but many of the very matches you need to identify those elusive distant ancestors will almost assuredly be gone. The baby will have been thrown out with the bathwater.

It’s generally not any individual match itself, but groups or clusters of matches that point the way – shared matches and ThruLines. If half or more of the cluster we need is gone, with no way to connect the genetic dots, we may never discover the identity of those ancestors. That’s a shame, because it negates the very benefit of being in the largest autosomal database. In a way, both Ancestry and we as their clients are victims of their own success.

Perhaps Ancestry will yet reverse their decision and if not, perhaps Ancestry’s competitors will see an unfulfilled opportunity here. I’d be glad to be a part of those discussions as well.

Take a look. What valuable nuggets are hiding in your smaller matches? Be sure to group those matches to prevent their deletion.

Provide Feedback to Ancestry

There’s still time to provide your feedback to Ancestry if you don’t want to lose your 6-8 cM matches later this month. Ancestry needs to serve all of their genealogical customers who have taken DNA tests, not just the most convenient. I encourage Ancestry to develop useful tools as others have done instead of deleting the matches we need in order to unmask those unknown ancestors.

  • Email Ancestry support at ancestrysupport@ancestry.com although there have been reports from some that this email doesn’t work, so you may need to utilize another contact method.
  • You can initiate an online “chat” here.
  • Call Ancestry support at 1-899-958-9124 although people have been reporting reaching offshore call centers and problems understanding representatives. You also may need to ask for a supervisor.
  • Ancestry corporate headquarters phone number on the website is listed as 801-705-7000.
  • You can’t post directly on Ancestry’s Facebook page, but you can comment on posts and you can message them.
  • Ancestry’s Twitter feed is here.

_____________________________________________________________

Rare African Y DNA Haplogroup A00 Sprouts New Branches

In 2012, the great-grandson of Albert Perry, a man born into slavery in South Carolina, tested his Y DNA and the result was the groundbreaking discovery of haplogroup A00, a very ancient branch of the Y tree found in Africa.

The results were announced at the Family Tree DNA Conference in 2012 and published the following year.

Early Y DNA tree dating was imprecise at best. As the tree expands and additional branches are added, our understanding of the Y tree structure, the movement of peoples, and the evolution of branches is enhanced.

In 2015, two Mbo people from Cameroon tested as described in the paper by Karmin et al.

A00 tree.png

Those men added branch A-YP2683 to the tree.

In 2018, a paper by D’Atanasio et al sequenced 104 living males including a man from Cameroon which added branch A-L1149.

In 2020, the paper by Lipson et al found an ancient branch of A00, subsequently named A-L1087, that was added above A00. The remains in which it was found date from between 3,000 and 8,000 years ago and are believed to be those of Bantu-speakers. Of course, that doesn’t tell us when A-L1087 occurred, but it does tell us that it occurred sometime before they were born.

How do you like the little skull indicating ancient DNA, as compared to the flags indicating the location of the earliest known ancestor of present-day testers? I’m very pleased to see ancient DNA results being incorporated into the tree.

A00 Lipson

What About Albert Perry’s Great-Grandson’s Y DNA?

The Y DNA of Albert Perry’s great-grandson had never been NGS sequenced with either the Big Y-500 or the current Big Y-700. NGS technology for Y DNA wasn’t yet available at the time. Is there more information to be gleaned from his DNA?

Recently, Albert Perry’s great-grandson’s DNA was upgraded to the Big Y-700, and two other descendants of Albert Perry tested at the Big Y-700 level as well.

The original 2012 tester, Albert Perry’s great-grandson, added branch A-L1100, and Albert’s great-great and great-great-great-grandsons split his branch once again by adding branch A-FT272432.

The haplogroup A Y DNA tree shows the new tree structure.

Looking at the Block Tree at FamilyTreeDNA, Albert Perry’s descendants are shown, along with the ancient sample at the far right.

A00 Perry block tree.png

Because so few men have tested and fallen into this line, the dark blue equivalent SNPs reach far back in time. As more men test, these will eventually be broken into individual branches.

The men who carry these important SNPs and their branching information will either be men from Africa or the diaspora.

I would like to thank the Perry family for their continuing contributions to science.

_____________________________________________________________

Plea to Ancestry – Rethink Match Purge Due to Deleterious Effect on African American Genealogists

I know this article is not going to be popular with some people and probably not with Ancestry, but this is something I absolutely must say. Those of us who are influencers with a public voice bear a responsibility to speak up.

Let me also add that if you are of European heritage and you think this topic doesn’t apply to you – if you have any unidentified ancestors – it does. Don’t discount this and skip over it. Please read. Our voices need to be heard in unison.

Ancestry Lewis.jpg

The Bottom Line

Here’s the bottom line. Ancestry’s planned purge of smaller segments, 6-8 cM, is the exact place that African Americans (and mixed Native Americans too) find their ancestral connections. This community has few other options.

I’m sure, given the Ancestry blog post by Margo Georgiadis, Ancestry’s President and CEO, on June 3rd, that this detrimental effect is neither understood nor intentional.

Ancestry Margo

Margo goes on to say, “At Ancestry, our products seek to democratize access to everyone’s family story and to bring people together.”

Yet, this planned match purge at the beginning of August does exactly the opposite. The outpouring of anguish from African American researchers has been palpable as they’ve described repeatedly how they use these segments to identify their genetic ancestors.

Additionally, my own experiences discovering several African American cousins over the past few days, as I’ve been working to preserve these smaller segment matches, have been pronounced. I can even tell them which family they connect through. That is a gift they simply cannot receive in any other way – other than through genetic connections.

These two factors combined, the community outcry and my own recent experiences, are what have led me to write this article. In other words, I simply can’t NOT write it.

I trust and have faith that Ancestry will rethink their decision and utilize this opportunity for good and take positive action. Accordingly, I’ve provided suggestions for how Ancestry can make changes that will allow people on both sides of this equation, meaning those who want to keep those smaller segment matches and those glad to be rid of them, to benefit – and how to do this before it’s too late.

I don’t know if Ancestry has African American genealogists who are both passionate and active, or mixed-race genealogists, on their management decision-making team or in their influencer group, but they should.

I don’t think Ancestry realizes the impact of what they are doing. African American research is different. Here’s why.

African American History and Genetic Genealogy

Slavery ended in the US in the 1860s. Formerly enslaved persons who had no agency and control over their own lives or bodies then adopted surnames.

We find them in the 1870 census carrying a surname of unknown origin. Some adopted their former owner’s surname, some adopted others. Generally today, their descendants don’t know why or how their surnames came to be.

Almost all descendants of freed slaves are admixed today, a combination of African, European and sometimes Native American ancestry, since Native Americans were enslaved alongside Africans.

Closer DNA matches reflect known and unknown family in the 3 or 4 generations since 1870, generally falling in the 2nd to 4th cousin range, depending on the ages of the people at the time of emancipation and also the distance between births in subsequent generations.

Ancestry freed ancestors.png

The three red generations are the potential testers today. The cM values, the amount of potential matching DNA at those relationship levels are taken from DNAPainter, here, which is an interactive representation of Blaine Bettinger’s Shared cM Project.

Assuming we’re not dealing with an adoption or unknown parent situation, most people either know or can fairly easily piece together their family through first or second cousins.

You can see that it’s not until we get to the third and fourth cousin level that genealogists potentially encounter small segment matches. However, at that level, the average match is still significantly above the Ancestry purge threshold of 6-8 cM. In other words, we might lose some of those matches, but the closer the match, the higher the probability that we will match them (at all) and that we will match them above the purge threshold.

Looking again at the DNAPainter charts, we see that it’s not until we move further out in terms of relationships that the average drops to those lower ranges.

Ancestry DNAPainter

Here’s the challenge – relationships that occurred before the time of emancipation are only going to be reflected in relationships more distant than fourth cousins – and that is the exact range where smaller segment matches can and do come into play most often.

The more distant the relationship, the smaller the average amount of shared DNA, which means the more likely you are ONLY to be able to identify the relationship through repeated matching of other people who share that same ancestor.
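
For readers who like to see the arithmetic, here is a rough sketch of why the averages fall off so quickly. It uses the textbook expectation that full nth cousins share, on average, a fraction of (1/2)^(2n+1) of their DNA. The 6800 cM total is my assumption for illustration only and differs by vendor. These are not the Shared cM Project figures shown at DNAPainter, which come from reported matches and tend to run higher for distant cousins because cousins who share nothing never show up as matches at all.

```python
# Rough theoretical expectation only; real matches vary widely around these values.
ASSUMED_TOTAL_CM = 6800.0  # assumption: approximate total cM compared, differs by vendor


def expected_shared_cm(cousin_degree: int, total_cm: float = ASSUMED_TOTAL_CM) -> float:
    """Average cM shared by full nth cousins: total_cm * (1/2) ** (2n + 1)."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return total_cm * 0.5 ** (2 * cousin_degree + 1)


if __name__ == "__main__":
    # Each additional cousin degree divides the expected amount by four.
    for n in range(1, 7):
        print(f"cousin degree {n}: about {expected_shared_cm(n):.1f} cM expected")
```

Under these assumptions, the expectation drops below the 6-8 cM range somewhere between fourth and fifth cousins, which is the same territory where pre-emancipation relationships fall.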

Let me give you an example. If you match repeatedly to a group of people who descend from Thomas Dodson in colonial Virginia, through multiple children, especially on the same segment, you need to focus on the Dodson family in your research. If you’re a male and your Y DNA matches the Dodson line closely, that’s a huge hint. This holds for any researcher, especially for females without surnames, but it applies to all ancestral lines for African American researchers.

If an African American researcher is trying to identify their genetic ancestors, that likely includes ancestors of European origin. Yes, this is an uncomfortable topic, but it’s the unvarnished truth.

Full stop.

How Can African Americans Identify European Ancestors?

While enslaved people did not have surnames from the beginning of their history on these shores until emancipation, European families did. Male lines carried the same surname generation to generation, and female surnames changed in a predictable pattern, allowing genealogists to track them backward in time (hopefully.)

Given that African American researchers are literally “flying blind,” attempting to identify people with whom to reconnect, with no knowledge of which families or surnames, they must be able to use both DNA matches and the combined ancestral trees of their matches in order to make meaningful connections.

For more information on how this is accomplished, please read the articles here and here.

Tool or Method | How it Works | Available at Ancestry?
Y DNA for males | Identifies the direct paternal line by surnames and also the haplogroup provides information as to the ancestral source such as European, African, Asian or Native American. | No, only available at FamilyTreeDNA.
Mitochondrial DNA | Identifies the direct matrilineal line. The haplogroup shows the ancestral source such as European, Native American, Asian or African. You can read about the different kinds of DNA, here. | No, only available at FamilyTreeDNA
Clustering | Identifies people all matching the tester and also matching to each other. | No, available through Genetic Affairs and DNAGedcom before Ancestry issued a cease and desist letter to them in June.
Genetic Trees | Tools to combine the trees of your matches to each other to identify common ancestors of your matches. You do not need a known tree for this to work. | No, available at Genetic Affairs before Ancestry issued a cease and desist letter to them.
Downloading Match Information | Including the direct ancestors for your matches. | No, Ancestry does not allow this, and tools like Pedigree Thief and DNAGedcom that did provide this functionality were served with cease-and-desist orders.
Painting Segments | Painting segments at DNAPainter allows the tester to identify the ancestral source of their segments. Multiple matches to people with the same ancestor indicates descent from that line. This is how I identify which line my matches are related to me through – and how I can tell my African American cousins how they are related and which family they descend from. | No. Ancestry does not provide segment location information, so painting is not possible with Ancestry matches unless both people transfer to companies that provide matching segment information and a chromosome browser (MyHeritage, FamilyTreeDNA)
ThruLines at Ancestry | Matches your tree to same ancestor in other people’s trees. | ThruLines is available to all testers, but the tester MUST have a tree and some connection to an ancestor in their tree before this works. Potential ancestors are sometimes suggested predicated on people already in the tester’s tree connected to ancestors in their matches trees. For ThruLines to work, a connection must be in someone’s tree so a connection can be made. There are no tree links for pre-emancipation owned families. Those connections must be made by DNA.
DNA Matching | Matching shows who you match genetically. Testers must validate that the match is identical by descent and not identical by chance by identifying the segment’s ancestry and confirming through either a parental match or matching to multiple cousins descending from the same ancestor at that same location. Segments of 7 cM have about a 50-50 chance of being legitimate and not false matches. Of course, that means that 50% are valid and tools can be utilized to determine which matches are and are not valid. All matches are hints, one way or another. You can read more, here. | Ancestry performs matching, but does not provide segment information. Testers can, however, look for multiple matches with the same ancestors in their trees. Automated tools such as Genetic Affairs cannot be used, so this needs to be done one match at a time. The removal of smaller segment matches will remove many false matches, but will also remove many valid matches and with them, the possibility of using those matches to identify genetic ancestors several generations ago, before 1870.
Shared Matching | Shows tester the people who match in common with them and another match. | Ancestry only shows shared matches of “fourth cousins and closer,” meaning only 20 cM and above. This immediately eliminates many if not most relevant shared matches from before emancipation – along with any possibility of recovering that information.

The Perfect, or Imperfect, Storm

As you can see from the chart above, African American genealogists are caught in the perfect, or imperfect, storm. Many tools are not available at Ancestry at all, and some that were have been served with cease-and-desist letters.

The segments this community most desperately needs to make family connections are the very ones most in jeopardy of being removed. They need the ability to look at those matches, not just alone, but in conjunction with people they match in clusters, plus trees of those clustered matches to identify their common ancestors.
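
To make the idea of clusters plus trees concrete, here is a minimal sketch. It is not how Genetic Affairs, DNAGedcom, or Ancestry actually work. I’ve swapped in the simplest possible grouping, connected components of a shared-match graph, and the match names, surnames, and function names are all hypothetical.

```python
from collections import Counter, defaultdict


def cluster_matches(shared_pairs):
    """Group matches into clusters: connected components of the shared-match graph."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:  # depth-first walk over everyone reachable from start
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        clusters.append(members)
    return clusters


def surnames_in_cluster(cluster, tree_surnames):
    """Count how many members of a cluster list each surname in their tree."""
    counts = Counter()
    for person in cluster:
        counts.update(set(tree_surnames.get(person, ())))
    return counts


if __name__ == "__main__":
    # Hypothetical data: pairs of my matches who also match each other.
    pairs = [("A", "B"), ("B", "C"), ("D", "E")]
    trees = {"A": ["Dodson", "Smith"], "B": ["Dodson"], "C": ["Dodson", "Jones"], "D": ["Lee"]}
    for cluster in cluster_matches(pairs):
        print(sorted(cluster), surnames_in_cluster(cluster, trees).most_common(2))
```

A surname that shows up in the trees of several people in the same cluster is the hint to chase. With the smaller matches purged and no way to see shared matches under 20 cM, the pairs that would feed a process like this simply aren’t there.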

Ancestry has the largest database but provides very few tools to benefit people who are searching for unknown ancestors, especially before 1850 – meaning people who don’t have surnames to work with.

Of course, this doesn’t just apply to African American researchers, but any genealogist who is searching for women whose surnames they don’t know. This also applies to people with unknown parentage that occurred a few generations back in time.

However, the difference is that African American genealogists don’t have ANY surnames to begin with. They literally hit their brick wall at 1870 and need automated tools to breach those walls. Removing their smaller segment matches literally removes the only tool they have to work with – the small scraps and tidbits available to them.

Yes, false matches will be removed, but all of their valid matches in that range will be removed too – nullifying any possibility of discovery.

A Plan Forward

You’ve probably figured out by now that I’m no longer invited to the Ancestry group calls. I’m fine with that because I’m not in any way constrained by embargoes or expectations. I only mention this for those of you who wonder why I’m saying this now, publicly, and why I didn’t say it earlier, privately, to Ancestry. I would have, had the opportunity arisen.

That said, I want to focus on finding a way forward.

Some options are clearly off the table. I’m sure Ancestry is not going to add Y or mitochondrial DNA testing, since they did that once and destroyed that database, along with the Sorenson database later. I’m equally as sure that they are not going to provide segment location information or a chromosome browser. I know that horse is dead, but still, chromosome browser…

My goal is to identify some changes Ancestry can make quickly that will result in a win-win for all researchers. It goes without saying that if researchers are happy, they buy more kits, and eventually, Ancestry will be happier too.

Right now, there are a lot, LOT, of unhappy researchers, but not everyone. So what can we do to make everyone happier?

Immediate Solutions

  • Remove the cease and desist orders from Genetic Affairs, DNAGedcom, Pedigree Thief and other third-party tools that researchers use for clustering, automated tree construction, downloading and managing matches.

This action could be implemented immediately and will provide HUGE benefits for the African American research community along with anyone who is searching for ancestors with no surnames. Who among us doesn’t have those?

  • Instead of purging small segment matches, implement a setting where people can define the threshold where they no longer see matches. The match would still appear to the other person. If I don’t want to see matches under 8 cM, I can select that level. If someone else wants to see all matches to 6 cM, they simply do nothing and see everything.
  • Continue to provide new matches to the 6 cM level. In other words, don’t just preserve what’s there today, but continue to provide this match level to genealogists.
  • Add shared matches under 20 cM so that genealogists know they do form clusters with multiple matches.
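
As a sketch of what a per-person threshold could look like, and only a sketch since I have no knowledge of how Ancestry stores matches, the idea is to filter what each viewer sees instead of deleting anything. The record layout, the names, and the 6 cM default below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    tester: str
    other: str
    shared_cm: float


def visible_matches(matches, viewer, min_cm_by_user, default_min_cm=6.0):
    """Return the matches a viewer sees; nothing is removed from the stored list."""
    threshold = min_cm_by_user.get(viewer, default_min_cm)
    return [
        m for m in matches
        if viewer in (m.tester, m.other) and m.shared_cm >= threshold
    ]


if __name__ == "__main__":
    stored = [Match("ann", "ben", 6.5), Match("ann", "cal", 25.0)]
    preferences = {"ann": 8.0}  # ann hides matches under 8 cM; ben sets nothing
    print(visible_matches(stored, "ann", preferences))  # only the 25 cM match
    print(visible_matches(stored, "ben", preferences))  # ben still sees ann at 6.5 cM
```

The point of the design is that one person’s preference never costs the other person their match.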

Longer-Term Solutions

  • Partner with companies like Genetic Affairs and DNAGedcom, whose tools provided not just match data, but automated solutions. These wouldn’t have been so popular if they weren’t so effective.
  • Implement some form of genetic networks, like clustering. Alternatively, form alliances with and embrace the tools that already exist.

The Message Customers Hear

By serving the third-party tools that serious genealogists used daily with cease-and-desist orders, then deleting many of our matches that can be especially useful when combined with automated tools, the message to genealogists is that our needs aren’t important and aren’t being heard.

For African American genealogists, these tools and smaller matches are the breadcrumbs, the final breadcrumb trail when there is nothing else at all that has the potential to connect them with their ancestors and connect us all together.

Let me say this again – many African Americans have nothing else.

To remove these small matches, rays of hope, is nothing short of immeasurably cruel, and should I say it, just one more instance of institutionalized racism, perpetrated without thinking. One more example of things the African American community cannot have today because of what happened to them and their ancestors in their past.

Plea

I will close this plea to Ancestry with another quote from Margaret Georgiadis from Ancestry’s blog.

Ancestry Margo 2.png

Businesses don’t get to claim commitment when convenient and then act otherwise. I hope this article has helped Ancestry to see a different perspective that they had not previously understood. Everyone makes mistakes and has to learn, companies included.

Ancestry, this ball’s in your court.

Feedback to Ancestry

I encourage you to provide feedback to Ancestry, immediately, before it’s too late.

You can do this by any or all of the following methods:

Ancestry support

Ancestry BLM.png

Speak out on social media, in groups where you are a member, or anyplace else that you can. Let’s find a solution, quickly, before it’s too late in another 10 days or so.

As John Lewis said, #goodtrouble.

Make a difference.

_____________________________________________________________

Genographic Project Participants: Last Chance to Preserve Your Results & Advance Science – Deadline June 30th

If you’re one of the one million+ public participants in the National Geographic Society’s Genographic Project, launched in 2005, you probably already know that testing has ceased and the website will be discontinued as of June 30th. Your results will no longer be available as of that date.

I wrote about the closing here and you can read what the Genographic project has to say about closing the public participation part of the project, here.

However, this doesn’t have to be the end of the DNA story.

You have great options for yourself and to continue the science. Your results can still be useful, however…

You MUST act before June 30th.

Please note that if you control the DNA of a deceased person who did not test elsewhere, this is literally your last chance to obtain any DNA results for them. If you transfer their DNA, you can upgrade and purchase additional tests at Family Tree DNA. If you don’t transfer, the opportunity to retrieve their DNA will be gone forever.

Three Steps + a Bonus

  1. Preserve Your Results – Sign in to the Genographic site and take screenshots, print, or download any data you wish to keep.
  2. Contribute to Science – Authorize the Genographic Project to utilize your results for ongoing scientific research, including The Million Mito Project.
  3. Transfer Your Results – If you tested before November 2016, you can transfer your results to FamilyTreeDNA and order upgrades if a sample remains.

Here are step-by-step instructions for completing all three.

First – Preserve Your Results

Sign on to your account at The Genographic Project. You’ll notice an option to print your results.

Geno profile

Scroll down and take one last look. Did you miss anything?

Your profile page includes the ability to download your raw genetic data.

Geno profile option

Your Account page, below, will look slightly different depending on the version of the test you took, but the download option is present for all versions of the test.

Geno download

The download file simply shows raw data values at specific positions and won’t be terribly useful to you.

Geno nucleotides

Generally, it’s the analysis of what these mutations mean, or matching to others for genealogy, that people seek.

At the very bottom of your results page, you’ll see the option to Contribute to Science.

Geno contribute

Click on “How You Can Help.”

Second – Contribute to Scientific Research

The best way to assure the legacy of the Genographic Project is to opt in for scientific research.

You can learn more about what happens when you authorize your results for scientific research, here.

Geno contribute box

Checking the little box authorizes anonymized scientific research on your sample now and in the future. This assures that your results won’t be destroyed on June 30th and will continue to be available to scientists.

The Genographic Project celebrated its 15th birthday in April 2020. Genographic Project data, including over 80,000 local and indigenous participants from over 100 countries, in addition to contributed public participation samples, has been included in approximately 85 research papers worldwide. Collaborative research is still underway. There’s still so much to learn.

Dr. Miguel Vilar, the lead scientist for the Genographic Project, is a partner in The Million Mito Project. The anonymized mitochondrial results of people who have opted-in for science will be available to that project, and others, through Dr. Vilar. Please support rewriting the tree of womankind by opting-in for scientific research.

Those words, “in the future” are the key to making sure this critical opportunity to continue the science doesn’t die.

If you don’t want to scroll down your page, you can access the scientific contribution authorization page directly from your profile.

Geno profile 2

To contribute to science, click on the “My Contribution to Science” tab.

Geno profile contribute

You’ll see the following screen. Then, check the box and click on the yellow “Contribute to Science” button. You’ll then be prompted with a few questions about your maternal and paternal heritage.

Geno check box

Contributing your results to science helps further scientific research into mankind, but transferring your results to FamilyTreeDNA preserves the usefulness of your DNA results for you and facilitates upgrading your DNA to obtain even more information.

Transferring also allows you to participate fully in The Million Mito Project which requires a full sequence mitochondrial DNA sample.

Third – Transfer Your Results to FamilyTreeDNA

If you tested before November 2016 when the Genographic Project switched to Helix for processing, you can transfer your results easily to Family Tree DNA.

If you don’t remember when you tested, sign in to your account. It’s easy to tell if transferring is an option.

Geno transfer option

If you are eligible to transfer, you’ll see this transfer option when you sign in.

Just click on the “Transfer Your Results” button. If you don’t want to sign in to Genographic to do the transfer, just click on this transfer link directly.

Geno transfer FTDNA

You will then see this no-hassle transfer option on the Family Tree DNA web page. Because FamilyTreeDNA did the laboratory processing for the Genographic Project from its inception in 2005 until November 2016, all you need to do is enter your Genographic kit number and the transfer takes place automatically.

Please note that if you DON’T transfer NOW, the Genographic Project is requesting the destruction of all non-transferred kits after June 30th, per their website.

Geno destroy

As you might imagine, preserving the DNA of a deceased person is critical if they didn’t test elsewhere and you have the authority to manage their DNA.

In order to support The Million Mito Project, Family Tree DNA is emailing a coupon to all people who transfer, offering a discount to upgrade to a full sequence mitochondrial DNA test.

After you transfer to Family Tree DNA, be sure to enter your earliest known ancestor and upload a tree. Here’s my “Four Quick Tips” article about getting the most out of mitochondrial DNA results, but it’s sage advice for Y DNA as well.

Bonus – Upgrade Transferred Kits

If you transfer your Genographic results to FamilyTreeDNA, you can then utilize the DNA sample provided for your Genographic DNA test for additional testing.

Different versions of the Genographic Project testing provided various types of results for your DNA. In some versions, testers received 12 Y STR markers or partial mitochondrial DNA results, and in other versions, partial haplogroups. You can only transfer what the Genographic provided, of course, but once transferred, you can order products and upgrades at Family Tree DNA, assuming a sample remains.

This is important, especially if you control the kit for a loved one who has now passed away. This may be your only opportunity to obtain their Y, mitochondrial, and/or autosomal DNA results. For example, my mother passed away before autosomal DNA testing was possible, but I’ve since upgraded her test at Family Tree DNA and was able to do so because her DNA was archived.

Support Science

Please support The Million Mito Project and other academic research by:

  • Choosing to contribute to science through the Genographic project and
  • Transferring your results to Family Tree DNA so that you can learn more and upgrade

Both options are totally free, and both are equally important.

Time is of the essence. You must act before June 30th.

Don’t let this be goodbye, simply au revoir – the legacy of your DNA can live on in another place, another way, another day.

_____________________________________________________________

Y DNA: Step-by-Step Big Y Analysis

Many males take the Big Y-700 test offered by FamilyTreeDNA, so named because testers receive the most granular haplogroup SNP results in addition to 700+ included STR marker results. If you’re not familiar with those terms, you might enjoy the article, STRs vs SNPs, Multiple DNA Personalities.

The Big Y test gives testers the best of both, along with contributing to the building of the Y phylotree. You can read about the additions to the Y tree via the Big Y, plus how it helped my own Estes project, here.

Some men order this test of their own volition, some at the request of a family member, and some in response to project administrators who are studying a specific topic – like a particular surname.

The Big Y-700 test is the most complete Y DNA test offered, testing millions of locations on the Y chromosome to reveal mutations, some unique and never before discovered, many of which are useful to genealogists. The Big Y-700 includes the traditional Y DNA STR marker testing along with SNP results that define haplogroups. Translated, both types of test results are compared to other men for genealogy, which is the primary goal of DNA testing.

Being a female, I often recruit males in my family surname lines and sponsor testing. My McNiel line, historic haplogroup R-M222, has been particularly frustrating both genealogically as well as genetically after hitting a brick wall in the 1700s. My McNeill cousin agreed to take a Big Y test, and this analysis walks through the process of understanding what those results are revealing.

After my McNeill cousin’s Big Y results came back from the lab, I spent a significant amount of time turning over every leaf to extract as much information as possible, both from the Big Y-700 DNA test itself and as part of a broader set of intertwined genetic information and genealogical evidence.

I invite you along on this journey as I explain the questions we hoped to answer and then evaluate Big Y DNA results along with other information to shed light on those quandaries.

I will warn you, this article is long because it’s a step-by-step instruction manual for you to follow when interpreting your own Big Y results. I’d suggest you simply read this article the first time to get a feel for the landscape, before working through the process with your own results. There’s so much available that most people leave lying on the table because they don’t understand how to extract the full potential of these test results.

If you’d like to read more about the Big Y-700 test, the FamilyTreeDNA white paper is here, and I wrote about the Big Y-700 when it was introduced, here.

You can read an overview of Y DNA, here, and Y DNA: The Dictionary of DNA, here.

Ok, get yourself a cuppa joe, settle in, and let’s go!

George and Thomas McNiel – Who Were They?

George and Thomas McNiel appear together in Spotsylvania County, Virginia records. Y DNA results, in combination with early records, suggest that these two men were brothers.

I wrote about discovering that Thomas McNeil’s descendant had taken a Y DNA test and matched George’s descendants, here, and about my ancestor George McNiel, here.

McNiel family history in Wilkes County, NC, recorded in a letter written in 1898 by George McNiel’s grandson tells us that George McNiel, born about 1720, came from Scotland with his two brothers, John and Thomas. Elsewhere, it was reported that the McNiel brothers sailed from Glasgow, Scotland and that George had been educated at the University of Edinburgh for the Presbyterian ministry but had a change of religious conviction during the voyage. As a result, a theological tiff developed that split the brothers.

George, eventually, if not immediately, became a Baptist preacher. His origins remain uncertain.

The brothers reportedly arrived about 1750 in Maryland, although I have no confirmation. By 1754, Thomas McNeil appeared in the Spotsylvania County, VA records with a male being apprenticed to him as a tailor. In 1757, in Spotsylvania County, the first record of George McNeil showed James Pey being apprenticed to him to learn the occupation of tailor.

If George and Thomas were indeed tailors, that’s not generally a country occupation and would imply that they both apprenticed as such when they were growing up, wherever that was.

Thomas McNeil is recorded in one Spotsylvania deed as being from King and Queen County, VA. If this is the case, and George and Thomas McNiel lived in King and Queen, at least for a time, this would explain the lack of early records, as King and Queen is a thrice-burned county. If there was a third brother, John, I find no record of him.

My now-deceased cousin, George McNiel, initially tested for the McNiel Y DNA and also functioned for decades as the family historian. George, along with his wife, inventoried the many cemeteries of Wilkes County, NC.

George believed through oral history that the family descended from the McNiels of Barra.

McNiel Big Y Kisumul

George had this lovely framed print of Kisimul Castle, seat of the McNiel Clan on the Isle of Barra, proudly displayed on his wall.

That myth was dispelled with the initial DNA testing when our line did not match the Barra line, as can be seen in the MacNeil DNA project, much to George’s disappointment. As George himself said, the McNiel history is both mysterious and contradictory. Amen to that, George!

McNiel Big Y Niall 9 Hostages

However, in place of that history, we were instead awarded the Niall of the 9 Hostages badge, created many years ago based on a 12 marker STR result profile. Additionally, the McNiel DNA was assigned to haplogroup R-M222. Of course, today that’s a far upstream haplogroup, but 15+ years ago, we had only a fraction of the testing or knowledge that we do today.

The name McNeil, McNiel, or however you spell it, resembles Niall, so on the surface, this made at least some sense. George was encouraged by the new information, even though he still grieved the loss of Kisimul Castle.

Of course, this also caused us to wonder about the story stating our line had originated in Scotland because Niall of the 9 Hostages lived in Ireland.

Niall of the 9 Hostages

Niall of the 9 Hostages was reportedly a High King of Ireland sometime between the 6th and 10th centuries. However, actual historical records place him living someplace in the mid-late 300s to early 400s, with his death reported in different sources as occurring before 382 and alternatively about 411. The Annals of the Four Masters dates his reign to 379-405, and Foras Feasa ar Eirinn says from 368-395. Activities of his sons are reported between 379 and 405.

In other words, Niall lived in Ireland about 1500-1600 years ago, give or take.

Migration

Generally, migration was primarily from Scotland to Ireland, not the reverse, at least as far as we know in recorded history. Many Scottish families settled in the Ulster Plantation beginning in 1606 in what is now Northern Ireland. The Scots-Irish immigration to the states had begun by 1718. Many Protestant Scottish families immigrated from Ireland carrying the traditional “Mc” names and Presbyterian religion, clearly indicating their Scottish heritage. The Irish were traditionally Catholic. George could have been one of these immigrants.

We have unresolved conflicts between the following pieces of McNeil history:

  • Descended from the McNeils of Barra – disproved through original Y DNA testing.
  • Immigrated from Glasgow, Scotland, and schooled in the Presbyterian religion in Edinburgh.
  • Descended from the Ui Neill dynasty, an Irish royal family dominating the northern half of Ireland from the 6th to 10th centuries.

Of course, it’s possible that our McNiel/McNeil line could have been descended from the Ui Neill dynasty AND also lived in Scotland before immigrating.

It’s also possible that they immigrated from Ireland, not Scotland.

And finally, it’s possible that the McNeil surname and M222 descent are not related and those two things are independent and happenstance.

A New Y DNA Tester

Since cousin George is, sadly, deceased, we needed a new male Y DNA tester to represent our McNiel line. Fortunately, one such cousin graciously agreed to take the Big Y-700 test so that we might, hopefully, answer numerous questions:

  • Does the McNiel line have a unique haplogroup, and if so, what does it tell us?
  • Does our McNiel line descend from Ireland or Scotland?
  • Where are our closest geographic clusters?
  • What can we tell by tracing our haplogroup back in time?
  • Do any other men match the McNiel haplogroup, and what do we know about their history?
  • Does the Y DNA align with any specific clans, clan history, or prehistory contributing to clans?

With DNA, you don’t know what you don’t know until you test.

Welcome – New Haplogroup

I was excited to see my McNeill cousin’s results arrive. He had graciously allowed me access, so I eagerly took a look.

He had been assigned to haplogroup R-BY18350.

McNiel Big Y branch

Initially, I saw that indeed, six men matched my McNeill cousin, assigned to the same haplogroup. Those surnames were:

  • Scott
  • McCollum
  • Glass
  • McMichael
  • Murphy
  • Campbell

Notice that I said, “were.” That’s right, because shortly after the results were returned, based on markers called private variants, Family Tree DNA assigned a new haplogroup to my McNeill cousin.

Drum roll please!!!

Haplogroup R-BY18332

McNiel Big Y BY18332

Additionally, my cousin’s Big Y test resulted in several branches being split, shown on the Block Tree below.

McNIel Big Y block tree

How cool is this!

This Block Tree graphic shows, visually, that our McNiel line is closest to McCollum and Campbell testers, and is a brother clade to those branches showing to the left and right of our new R-BY18332. It’s worth noting that BY25938 is an equivalent SNP to BY18332, at least today. In the future, perhaps another tester will test, allowing those two branches to be further subdivided.

Furthermore, after the new branches were added, Cousin McNeill has no more Private Variants, which are unnamed SNPs. They were all utilized in naming additional tree branches!

I wrote about the Big Y Block Tree here.

Niall (Or Whoever) Was Prolific

The first thing that became immediately obvious was how successful our progenitor was.

McNiel Big Y M222 project

In the MacNeil DNA project, 38 men with various surname spellings descend from M222. There are more in the database who haven’t joined the MacNeil project.

Whoever originally carried SNP R-M222, someplace between 2400 and 5900 years ago, according to the block tree, either had many sons who had sons, or his descendants did. One thing is for sure, his line certainly is in no jeopardy of dying out today.

The Haplogroup R-M222 DNA Project, which studies this particular haplogroup, reads like a who’s who of Irish surnames.

Big Y Match Results

Big Y matches must have no more than 30 SNP differences total, including private variants and named SNPs combined. Named SNPs function as haplogroup names. In other words, Cousin McNeill’s terminal SNP, meaning the SNP furthest down on the tree, R-BY18332, is also his haplogroup name.

Private variants are mutations that have occurred in the line being tested, but not yet in other lines. Occurrences of private variants in multiple testers allow the Private Variant to be named and placed on the haplotree.

Of course, Family Tree DNA offers two types of Y DNA testing, STR testing which is the traditional 12, 25, 37, 67 and 111 marker testing panels, and the Big Y-700 test which provides testers with:

  • All 111 STR markers used for matching and comparison
  • Another 589+ STR markers only available through the Big Y test increasing the total STR markers tested from 111 to minimally 700
  • A scan of the Y chromosome, looking for new and known SNPs and STR mutations

These tests keep on giving, both through matching and, in the case of the Big Y, through continued haplogroup discovery and refinement as more testers test. The Big Y is an investment, not just a one-time purchase.

I wrote about the Big Y-700 when it was introduced here and a bit later here.

Let’s see what the results tell us. We’ll start by taking a look at the matches, the first place that most testers begin.

Mcniel Big Y STR menu

Regular Y DNA STR matching shows the results for the STR results through 111 markers. The Big Y section, below, provides results for the Big Y SNPs, Big Y matches and additional STR results above 111 markers.

McNiel Big Y menu

Let’s take a look.

STR and SNP Testing

Of Cousin McNeill’s matches, 2 Big Y testers and several STR testers carry some spelling variant of the surname, such as Neal, Neel, McNiel, McNeil, or O’Neil.

While STR matching is focused primarily on a genealogical timeframe, meaning current to roughly 500-800 years in the past, SNP testing reaches much further back in time.

  • STR matching reaches approximately 500-800 years.
  • Big Y matching reaches approximately 1500 years.
  • SNPs and haplogroups reach back much further, well beyond the genealogical timeframe, and can be tracked historically, shedding light on our ancestors’ migration paths and helping to answer the age-old question of “where did we come from.”

These STR and Big Y time estimates are based on a maximum number of mutations for testers to be considered matches paired with known genealogy.

Big Y results consider two men a match if they have 30 or fewer total SNP differences. Using NGS (next generation sequencing) scan technology, the targeted regions of the Y chromosome are scanned multiple times, although not all regions are equally useful.

Individually tested SNPs are still occasionally available in some cases, but individual SNP testing has generally been eclipsed by the greatly more efficient enriched technology utilized with Big Y testing.

Think of SNP testing as walking up to a specific location and taking a look, while NGS scan technology is a drone flying over the entire region 30-50 times, looking repeatedly to be sure it sees the more distant target accurately.

Multiple scans acquiring the same read in the same location, shown below in the Big Y browser tool by the pink mutations at the red arrow, confirm that NGS sequencing is quite reliable.

McNiel Big Y browser

These two types of tests, STR panels 12-111 and the SNP-based Big Y, are meant to be utilized in combination with each other.

STR markers tend to mutate faster and are less reliable, experiencing frustrating back mutations. SNPs very rarely experience this level of instability. Some regions of the Y chromosome are messier or more complicated than others, causing problems with interpreting reads reliably.

For purposes of clarity, the string of pink A reads above is “not messy,” and “A” is very clearly a mutation because all ~39 scanned reads report the same value of “A,” and according to the legend, all of those scans are high quality. Multiple combined reads of A and G, for example, in the same location, would be tough to call accurately and would be considered unreliable.

You can see examples of a few scattered pink misreads, above.

The two different kinds of tests produce results for overlapping timeframes – with STR mutations generally sifting through closer relationships and SNPs reaching back further in time.

Many more men have taken the Y DNA STR tests over the last 20 years. The Big Y tests have only been available for the past handful of years.

STR testing produces the following matches for my McNiel cousin:

STR Level | STR Matches | STR Matches Who Took the Big Y | % STR Who Took Big Y | STR Matches Who Also Match on the Big Y
12 | 5988 | 796 | 13 | 52
25 | 6660 | 725 | 11 | 57
37 | 878 | 94 | 11 | 12
67 | 1225 | 252 | 21 | 23
111 | 4 | 2 | 50 | 1
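
To make the mismatch easier to see, here is a minimal sketch that recomputes the percentage column from the table above and adds one more figure: the share of Big Y takers at each STR level who are also Big Y matches. The numbers are copied from the table; the function and field names are my own for illustration.

```python
# Rows copied from the table above:
# (STR level, STR matches, matches who took the Big Y, matches who also match on the Big Y)
rows = [
    (12, 5988, 796, 52),
    (25, 6660, 725, 57),
    (37, 878, 94, 12),
    (67, 1225, 252, 23),
    (111, 4, 2, 1),
]


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


summary = []
for level, str_matches, took_big_y, also_big_y in rows:
    summary.append({
        "level": level,
        # Same calculation as the "% STR Who Took Big Y" column
        "pct_took_big_y": percent(took_big_y, str_matches),
        # Of the men who took the Big Y, how many are actually Big Y matches
        "pct_big_y_takers_who_match": percent(also_big_y, took_big_y),
    })

for line in summary:
    print(line)
```

Apart from the tiny 111-marker group, only a small fraction of the STR matches who took the Big Y are also Big Y matches.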

Typically, one would expect that all STR matches that took the Big Y would match on the Big Y, since STR results suggest relationships closer in time, but that’s not the case.

  • Many STR testers who have taken the Big Y seem to be just slightly too distant to be considered a Big Y match using SNPs, which flies in the face of conventional wisdom.
  • However, this could easily be a function of the fact that STRs mutate both backward and forwards and may have simply “happened” to have mutated to a common value – which suggests a closer relationship than actually exists.
  • It could also be that the SNP matching threshold needs to be raised since the enhanced and enriched Big Y-700 technology now finds more mutations than the older Big Y-500. I would like to see SNP matching expanded to 40 from 30 because it seems that clan connections may be being missed. Thirty may have been a great threshold before the more sensitive Big Y-700 test revealed more mutations, which means that people hit that 30 threshold before they did with previous tests.
  • Between the combination of STRs and SNPs mutating at the same time, some Big Y matches are pushed just out of range.

In a nutshell, the correlation I expected to find in terms of matching between STR and Big Y testing is not what I found. Let’s take a look at what we discovered.

It’s worth noting that the analysis is easier if you are working together with at least your closest matches or have access via projects to at least some of their results. You can see common STR values to 111 in projects, such as surname projects. Project administrators can view more if project members have allowed access.

Unexpected Discoveries and Gotchas

While I did expect STR matches to also match on the Big Y, I didn’t expect the Big Y matches to necessarily match on the STR tests. After all, the Big Y is testing for more deep-rooted history.

Only one of the McNiel Big Y matches also matches at all levels of STR testing. That’s not surprising since Big Y matching reaches further back in time than STR testing, and indeed, not all STR testers have taken a Big Y test.

Of my McNeill cousin’s closest Big Y matches, we find the following relative to STR matching.

Surname | Ancestral Location | Big Y Variant/SNP Difference | STR Match Level
Scott | 1565 in Buccleuch, Selkirkshire, Scotland | 20 | 12, 25, 37, 67
McCollum | Not listed | 21 | 67 only
Glass | 1618 in Banbridge, County Down, Ireland | 23 | 12, 25, 67
McMichael | 1720 County Antrim, Ireland | 28 | 67 only
Murphy | Not listed | 29 | 12, 25, 37, 67
Campbell | Scotland | 30 | 12, 25, 37, 67, 111

It’s ironic that the man who matches on all STR levels has the most variants, 30 – so many that with 1 more, he would not have been considered a Big Y match at all.

Only the Campbell man matches on all STR panels. Unfortunately, this Campbell male does not match the Clan Campbell line, so that momentary clan connection theory is immediately put to rest.

Block Tree Matches – What They Do, and Don’t, Mean

Note that a Carnes male, the other person who matches my McNeill cousin at 111 STR markers and has taken a Big Y test, does not match at the Big Y level. His haplogroup, BY69003, is located several branches up the tree, with our common ancestor, R-S588, having lived about 2000 years ago. Interestingly, we do match other R-S588 men.

This is an example where the total number of SNP mutations is greater than 30 for these 2 men (McNeill and Carnes), but not for my McNeill cousin compared with other men on the same S588 branch.

McNiel Big Y BY69003

By searching for Carnes on the block tree, I can view my cousin’s match to Mr. Carnes, even though they don’t match on the Big Y. STR matches who have taken the Big Y test, even if they don’t match at the Big Y level, are shown on the Block Tree on their branch.

By clicking on the haplogroup name, R-BY69003, above, I can then see three categories of information about the matches at that haplogroup level, below.

McNiel Big Y STR differences

By selecting “Matches,” I can see results under the column, “Big Y.” This does NOT mean that the tester matches either Mr. Carnes or Mr. Riker on the Big Y, but is telling me that there are 14 differences out of 615 STR markers above 111 markers for Mr. Carnes, and 8 of 389 for Mr. Riker.

In other words, this Big Y column is providing STR information, not indicating a Big Y match. You can’t tell one way or another if someone shown on the Block Tree is shown there because they are a Big Y match or because they are an STR match that shares the same haplogroup.

As a cautionary note, your STR matches that have taken the Big Y ARE shown on the block tree, which is a good thing. Just don’t assume that means they are Big Y matches.

The 30 SNP threshold precludes some matches.

My research indicates that the people who match on STRs and carry the same haplogroup, but don’t match at the Big Y level, are every bit as relevant as those who do match on the Big Y.

McNIel Big Y block tree menu

If you’re not vigilant when viewing the block tree, you’ll make the assumption that you match all of the people showing on the Block Tree on the Big Y test since Block Tree appears under the Big Y tools. You have to check Big Y matches specifically to see if you match people shown on the Block Tree. You don’t necessarily match all of them on the Big Y test, and vice versa, of course.

You match Block Tree inhabitants either:

  • On the Big Y, but not the STR panels
  • On the Big Y AND at least one level of STRs between 12 and 111, inclusive
  • On STRs to someone who has taken the Big Y test, but whom you do not match on the Big Y test
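
The three categories above can be sketched as a small classification function. This is only an illustration, assuming you have already recorded each person's total SNP difference and STR match levels in your own spreadsheet. The function name and labels are mine, and the 31 used for Mr. Carnes is a stand-in, since all we know is that his difference is greater than 30.

```python
BIG_Y_THRESHOLD = 30  # maximum total SNP differences for a Big Y match


def block_tree_category(snp_differences, str_levels_matched):
    """Classify a Block Tree inhabitant relative to the tester.

    snp_differences: total SNP differences (private variants plus named SNPs)
    str_levels_matched: STR panels (12, 25, 37, 67, 111) on which the two men match
    """
    big_y_match = snp_differences <= BIG_Y_THRESHOLD
    str_match = len(str_levels_matched) > 0
    if big_y_match and str_match:
        return "Big Y and STR match"
    if big_y_match:
        return "Big Y match only"
    if str_match:
        return "STR match only (took the Big Y, but not a Big Y match)"
    return "not a match"


# Campbell: 30 differences, matches on every STR panel
print(block_tree_category(30, [12, 25, 37, 67, 111]))
# Carnes: more than 30 differences (31 is a placeholder), matches at 111 markers
print(block_tree_category(31, [111]))
```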

Big Y-500 or Big Y-700?

McNiel Big Y STR differences

Looking at the number of STR markers on the matches page of the Block Tree for BY69003, above, or on the STR Matches page is the only way to determine whether or not your match took the Big Y-700 or the Big Y-500 test.

If you add 111 to the Big Y STR number of 615 for Mr. Carnes, the total equals 726, which is more than 700, so you know he took the Big Y-700.

If you add 111 to 389 for Mr. Riker, you get 500, which is less than 700, so you know that he took the Big Y-500 and not the Big Y-700.

There are still a very small number of men in the database who did not upgrade to 111 when they ordered their original Big Y test, but generally, this calculation methodology will work. Today, all Big Y tests are upgraded to 111 markers if they have not already tested at that level.
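
The arithmetic above is easy to put in a helper if you are checking many matches. This is a rough sketch that assumes the match has tested all 111 panel markers, which, as noted, is not true for a small number of men. The function name is my own.

```python
def infer_big_y_version(big_y_str_count, panel_markers=111):
    """Guess which Big Y test a match took from his STR marker count.

    big_y_str_count: STR markers reported above 111 (the second number in the Big Y column)
    panel_markers: markers from the standard panels, assumed here to be 111
    """
    total = big_y_str_count + panel_markers
    version = "Big Y-700" if total >= 700 else "Big Y-500"
    return version, total


print(infer_big_y_version(615))  # Mr. Carnes: 615 + 111 = 726
print(infer_big_y_version(389))  # Mr. Riker: 389 + 111 = 500
```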

Why does Big Y-500 vs Big Y-700 matter? The enriched chemistry behind the testing technology improved significantly with the Big Y-700 test, enhancing Y-DNA results. I was an avowed skeptic until I saw the results myself after upgrading men in the Estes DNA project. In other words, if Big Y-500 testers upgrade, they will probably have more SNPs in common.

You may want to contact your closest Big Y-500 matches and ask if they will consider upgrading to the Big Y-700 test. For example, if we had close McNiel or similar surname matches, I would do exactly that.

Matching Both the Big Y and STRs – No Single Source

There is no single place or option to view whether or not you match someone BOTH on the Big Y AND STR markers. You can see both match categories individually, of course, but not together.

You can determine if your STR matches took the Big Y, below, and their haplogroup, which is quite useful, but you can’t tell if you match them at the Big Y level on this page.

McNiel Big Y STR match Big Y

Selecting “Display Only Matches With Big Y” means displaying matches to men who took the Big Y test, not necessarily men you match on the Big Y. Mr. Conley, in the example above, does not match my McNeill cousin on the Big Y but does match him at 12 and 25 STR markers.

I hope FTDNA will add three display options:

  • Select only men that match on the Big Y in the STR panel
  • Add an option for Big Y on the advanced matches page
  • Indicate men who also match on STRs on the Big Y match page

It was cumbersome and frustrating to have to view all of the matches multiple times to compile various pieces of information in a separate spreadsheet.

No Big Y Match Download

There is also no option to download your Big Y matches. With a few matches, this doesn’t matter, but with 119 matches, or more, it does. As more people test, everyone will have more matches. That’s what we all want!

What you can do, however, is to download your STR matches from your match page at levels 12-111 individually, then combine them into one spreadsheet. (It would be nice to be able to download them all at once.)

McNiel Big Y csv

You can then add your Big Y matches manually to the STR spreadsheet, or you can simply create a separate Big Y spreadsheet. That’s what I chose to do after downloading my cousin’s 14,737 rows of STR matches. I told you that R-M222 was prolific! I wasn’t kidding.
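
If you would rather not paste the five downloads together by hand, here is a minimal sketch of combining them with a short script. The column names and sample rows are placeholders I made up for the demonstration, not the actual headers in the downloaded files, so adjust them to match your own CSV files.

```python
import csv
import io


def combine_str_downloads(downloads):
    """Merge per-panel CSV downloads into one list of rows tagged with the panel.

    downloads maps an STR level (12, 25, 37, 67, 111) to an open text file.
    """
    combined = []
    for level, handle in sorted(downloads.items()):
        for row in csv.DictReader(handle):
            row["STR Level"] = level  # remember which panel this match came from
            combined.append(row)
    return combined


# Tiny stand-in files; with real downloads you would pass open(...) handles instead.
sample_12 = io.StringIO("Full Name,Last Name\nJohn Neal,Neal\nTom Scott,Scott\n")
sample_25 = io.StringIO("Full Name,Last Name\nJohn Neal,Neal\n")

rows = combine_str_downloads({12: sample_12, 25: sample_25})
print(len(rows), "combined rows")
```

A man who matches at several levels appears once per level, which is why the combined row count is so much larger than the number of distinct people.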

This high number of STR matches also perfectly illustrates why the Big Y SNP results were so critical in establishing the backbone relationship structure. Using the two tools together is indispensable.

An additional benefit to downloading STR results is that you can sort the STR spreadsheet columns in surname order. This facilitates easily spotting all spelling variations of McNiel, including words like Niel, Neal and such that might be relevant but that you might not notice otherwise.
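
Sorting works well by eye, but a pattern search can also pull the spelling variations out of a long list. The sketch below is one possible approach, not something Family Tree DNA provides. The pattern is my own guess at the common spellings and will need adjusting for other surnames.

```python
import re

# Matches Neal/Neel/Niel/Neil/Neill with an optional Mc, Mac or O' prefix.
NEIL_PATTERN = re.compile(r"^(?:ma?c|o['’]?)?\s*n(?:ea|ee|ie|ei)ll?s?$", re.IGNORECASE)


def neil_variants(surnames):
    """Return the distinct surnames that look like a spelling of McNiel, sorted."""
    return sorted({s for s in surnames if NEIL_PATTERN.match(s.strip())})


surnames = ["McNiel", "Neal", "O'Neil", "Scott", "MacNeil", "Campbell", "Neel", "McNeill"]
print(neil_variants(surnames))
```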

Creating a Big Y Spreadsheet

My McNiel cousin has 119 Big Y-700 matches.

I built a spreadsheet with the following columns facilitating sorting in a number of ways, with definitions as follows:

McNiel Big Y spreadsheet

  • First Name
  • Last Name – You will want to search matches on your personal page at Family Tree DNA by this surname later, so be sure if there is a hyphenated name to enter it completely.
  • Haplogroup – You’ll want to sort by this field.
  • Convergent – A field you’ll complete when doing your analysis. Convergence is the common haplogroup in the tree shared by you and your match. In the case of the green matches above, which are color-coded on my spreadsheet to indicate the closest matches with my McNiel cousin, the convergent haplogroup is BY18350.
  • Common Tree Gen – This column is the generations on the Block Tree shown to this common haplogroup. In the example above, it’s between 9 and 14 SNP generations. I’ll show you where to gather this information.
  • Geographic Location – Can be garnered from 4 sources. No color in that cell indicates that this information came from the Earliest Known Ancestor (EKA) field in the STR matches. Blue indicates that I opened the tree and pulled the location information from that source. Orange means that someone else by the same surname whom the tester also Y DNA matches shows this location. I am very cautious when assigning orange, and it’s risky because it may not be accurate. A fourth source is to use Ancestry, MyHeritage, or another genealogical resource to identify a location if an individual provides genealogical information but no location in the EKA field. Utilizing genealogy databases is only possible if enough information is provided to make a unique identification. John Smith 1700-1750 won’t do it, but Seamus McDougal (1750-1810) married to Nelly Anderson might just work.
  • STR Match – Tells me if the Big Y match also matches on STR markers, and if so, which ones. Only the first 111 markers are used for matching. No STR match generally means the match is further back in time, but there are no hard and fast rules.
  • Big Y Match – My original goal was to combine this information with the STR match spreadsheet. If you don’t wish to combine the two, then you don’t need this column.
  • Tree – An easy way for me to keep track of which matches do and do not have a tree. Please upload or create a tree.

You can also add a spreadsheet column for comments or contact information.

McNiel Big Y profile

You will also want to click your match’s name to display their profile card, paying particular attention to the “About Me” information where people sometimes enter genealogical information. Also, scan the Ancestral Surnames where the match may enter a location for a specific surname.

Private Variants

I added additional spreadsheet columns, not shown above, for Private Variant analysis. That level of analysis is beyond what most people are interested in doing, so I’m only briefly discussing this aspect. You may want to read along, so you at least understand what you are looking at.

Clicking on Private Variants in your Big Y Results shows your variants, or mutations, that are unnamed as SNPs. When they are named, they become SNPs and are placed on the haplotree.

The reference or “normal” state for the DNA allele at that location is shown as the “Reference,” and “Genotype” is the result of the tester. Reference results are not shown for each tester, because the majority are the same. Only mutations are shown.

McNiel Big Y private variants

There are 5 Private Variants, total, for my cousin. I’ve obscured the actual variant numbers and instead typed in 111111 and 222222 for the first two as examples.

McNiel Big Y nonmatching variants

In our example, there are 6 Big Y matches, with matches one and five having the non-matching variants shown above.

Non-matching variants mean that the match, Mr. Scott, in example 1, does NOT match the tester (my cousin) on those variants.

  • If the tester (you) has no mutation, you won’t have a Private Variant shown on your Private Variant page.
  • If the tester does have a Private Variant shown, and that variant shows on their match’s list of Non-Matching Variants, it means the match does NOT match the tester, and either has the normal reference value or a different mutation. Explained another way, if you have a mutation, and that variant is listed on your match’s list of Non-Matching Variants, your match does NOT match you and does NOT have the same mutation.
  • If the match does NOT have the Private Variant on their list, that means the match DOES match the tester, and they both have the same mutation, making this Private Variant a candidate to be named as a new SNP.
  • If you don’t have a Private Variant listed, but it shows in the Non-Matching Variants of your match, that means you have the reference or normal value, and they have a mutation.

In example #1, above, the tester has a mutation at variant 111111, and 111111 is shown as a Non-Matching Variant to Mr. Scott, so Mr. Scott does NOT match the tester. Mr. Scott also does NOT match the tester at locations 222222 and 444444.

In example #5, 111111 is NOT shown on the Non-Matching Variant list, so Mr. Treacy DOES match the tester.

I have a terrible time wrapping my head around the double negatives, so it’s critical that I make charts.

On the chart below, I’ve listed the tester’s private variants in an individual column each, so 111111, 222222, etc.

For each match, I’ve copied and pasted their Non-Matching Variants in a column to the right of the tester’s variants, in the lavender region. In this example, I’ve typed the example variants into separate columns for each tester so you can see the difference. Remember, a non-matching variant means they do NOT match the tester’s mutation.

McNiel private variants spreadsheet

On my normal spreadsheet, where the non-matching variants don’t have individual columns, I then search for the first variant, 111111. If the variant does appear in the list, it means that match #1 does NOT have the mutation, so I DON’T put an X in the box for match #1 under 111111.

In the example above, the only match that does NOT have 111111 on their list of Non-Matching Variants is #5, so an X IS placed in that corresponding cell. I’ve highlighted that column in yellow to indicate this is a candidate for a new SNP.

You can see that no one else has the variant, 222222, so it truly is totally private. It’s not highlighted in yellow because it’s not a candidate to be a new SNP.

Everyone shares mutation 333333, so it’s a great candidate to become a new SNP, as is 555555.

Match #6 shares the mutation at 444444, but no one else does.
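
Because the double negatives are so easy to trip over, the same chart can be built with a few lines of code. This is only a sketch of my manual process, not the automated software Family Tree DNA runs. The lists for matches 1, 5 and 6 follow the examples in the text, while the lists for matches 2 through 4 are filled in so that they agree with the chart as described.

```python
tester_variants = ["111111", "222222", "333333", "444444", "555555"]

# Non-Matching Variants per match, limited to the tester's own variants.
# Matches 1, 5 and 6 follow the text; matches 2-4 are assumed to look like match 1.
non_matching = {
    1: {"111111", "222222", "444444"},  # Mr. Scott
    2: {"111111", "222222", "444444"},
    3: {"111111", "222222", "444444"},
    4: {"111111", "222222", "444444"},
    5: {"222222", "444444"},            # Mr. Treacy
    6: {"111111", "222222"},
}


def sharing_matches(variant):
    """Matches who share the mutation: the variant is NOT on their non-matching list."""
    return sorted(m for m, listed in non_matching.items() if variant not in listed)


# One entry per tester variant, listing the matches that would get an X in that column
chart = {variant: sharing_matches(variant) for variant in tester_variants}
truly_private = [variant for variant, shared_by in chart.items() if not shared_by]

for variant, shared_by in chart.items():
    print(variant, shared_by)
print("Shared with no one:", truly_private)
```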

This is a manual illustration of an automated process that occurs at Family Tree DNA. After Big Y matches are returned, automated software creates private variant lists of potential new haplogroups that are then reviewed internally where SNPs are evaluated, named, and placed on the tree if appropriate.

If you follow this process and discover matches, you probably don’t need to do anything, as the automated review process will likely catch up within a few days to weeks.

Big Y Matches

In the case of the McNiel line, it was exciting to discover several private variants, mutations that were not yet named SNPs, found in several matches that were candidates to be named as SNPs and placed on the Y haplotree.

Sure enough, a few days later, my McNeill cousin had a new haplogroup assignment.

Most people have at least one Private Variant, locations in which they do NOT match another tester. When several people have these same mutations, and they are high-quality reads, the Private Variant qualifies to be added to the haplotree as a SNP, a task performed at FamilyTreeDNA by Michael Sager.

If you ever have the opportunity to hear Michael speak, please do so. You can watch Michael’s presentation at Genetic Genealogy Ireland (GGI) titled “The Tree of Mankind,” on YouTube, here, compliments of Maurice Gleeson who coordinates GGI. Maurice has also written about the Gleeson Y DNA project analysis, here.

As a result of Cousin McNeill’s test, six new SNPs have been added to the Y haplotree, the tree of mankind. You can see our new haplogroup for our branch, BY18332, with an equivalent SNP, BY25938, along with three sibling branches to the left and right on the tree.

McNiel Big Y block tree 4 branch

Big Y testing not only answers genealogical questions, it advances science by building out the tree of mankind too.

The surnames of the men who share the same haplogroup, R-BY18332, meaning the named SNP furthest down the tree, are McCollum and Campbell. Not what I expected. I expected to find a McNeil who does match on at least some STR markers. This is exactly why the Big Y is so critical to define the tree structure, then use STR matches to flesh it out.

Taking the Big Y-700 test provided granularity between 6 matches, shown above, who were all initially assigned to the same branch of the tree, BY18350, but were subsequently divided into 4 separate branches. My McNiel cousin is no longer equally as distant from all 6 men. We now know that our McNiel line is genetically closer on the Y chromosome to Campbell and McCollum and further distant from Murphy, Scott, McMichael, and Glass.

Not All SNP Matches are STR Matches

Not all SNP matches are also STR matches. Some relationships are too far back in time. However, in this case, while each person on the BY18350 branches matches at some STR level, only the Campbell individual matches at all STR levels.

Remember that variants (mutations) are accumulating down both respective branches of the tree at the same time, at a rate of roughly one every 100 years (if 100 is the average number we want to use) for each tester. A total of 30 variants or mutations difference, an average of 15 on each branch of the tree (McNiel and their match) would suggest a common ancestor about 1500 years ago, so each Big Y match should have a common ancestor 1500 years ago or closer. At least on average, in theory.
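
Here is that back-of-the-envelope estimate as a tiny function. It is only a sketch of the averaging described above, assuming the differences are split evenly between the two branches. The 100-year default is the same round number used in the text, not a measured mutation rate.

```python
def estimated_years_to_common_ancestor(total_variant_difference, years_per_variant=100):
    """Rough time to a common ancestor from the total variant difference.

    Assumes the differences are split evenly between the two testers' branches
    and that one variant arises about every years_per_variant years per branch.
    """
    per_branch = total_variant_difference / 2
    return per_branch * years_per_variant


print(estimated_years_to_common_ancestor(30))  # the Big Y match threshold
print(estimated_years_to_common_ancestor(20))  # Mr. Scott, the closest match by variants
```

Real lines do not mutate on schedule, so treat the result as an average, not a date.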

The Big Y test match threshold is 30 variants, so if there were any more mismatches with the Campbell male, they would not have been a Big Y match, even though they have the exact same haplogroup.

Having the same haplogroup means that their terminal SNP is identical, the SNP furthest down the tree today, at least until someone matches one of them on their Private Variants (if any remain unnamed) and a new terminal SNP is assigned to one or both of them.

Mutations, and when they happen, are truly a roll of the dice. This is why viewing all of your Big Y Block Tree matches is critical, even if they don’t show on your Big Y match list. One more variant and Campbell would not have been shown as a match, yet he is actually quite close, on the same branch, and matches on all STR panels as well.

SNPs Establish the Backbone Structure

I always view the block tree first to provide a branching tree structure, then incorporate STR matches into the equation. Both can be equally important to genealogy, but haplogroup assignment is the most accurate tool, regardless of whether the two individuals match on the Big Y test, especially if the haplogroups are relatively close.

Let’s work with the Block Tree.

The Block Tree

McNIel Big Y block tree menu

Clicking on the link to the Block Tree in the Big Y results immediately displays the tester’s branch on the tree, below.

On the left side are SNP generation markers. Keep in mind that approximate SNP generations are marked every 5 generations. The most recent generations are based on the number of private variants that have not yet been assigned as branches on the tree. It’s possible that when they are assigned, they will be placed upstream someplace, meaning that placement will reduce the number of early branches and perhaps increase the number of older branches.

The common haplogroup of all of the branches shown here with the upper red arrow is R-BY3344, about 15 SNP generations ago. If you’re using 100 years per SNP generation, that’s about 1500 years. If you’re using 80 years, then 1200 years ago. Some people use even fewer years for calculations.
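
Since the answer depends entirely on how many years you allow per SNP generation, it can help to see the options side by side. This is a simple sketch of the multiplication above. The function name is mine, and the year values are the two mentioned in the text.

```python
def years_ago(snp_generations, years_per_generation):
    """Convert a Block Tree SNP generation count into approximate years."""
    return snp_generations * years_per_generation


# R-BY3344 sits at about 15 SNP generations on the Block Tree
for years_per_generation in (100, 80):
    print(years_per_generation, "years per generation:", years_ago(15, years_per_generation), "years ago")
```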

If some of the private variants in the closer branches disappear, then the common ancestral branch may shift to closer in time.

This tree will always be approximate because some branches can never be detected. They have disappeared entirely over time when no males exist to reproduce.

Conversely, subclades have been born since a common ancestor clade whose descendants haven’t yet tested. As more people test, more clades will be discovered.

Therefore, most recent common ancestor (MRCA) haplogroup ages can only be estimated, based on who has tested and what we know today. The tree branches also vary depending on whether testers have taken the Big Y-500 or the more sensitive Big Y-700, which detects more variants. The Y haplotree is a combination of both.

Big Y-500 results will not be as granular and potentially do not position test-takers as far down the tree as Big Y-700 results would if they upgraded. You’ll need to factor that into your analysis if you’re drawing genealogical conclusions based on these results, especially close results.

You’ll note that the direct path of descent is shown above with arrows from BY3344 through the first blue box with 5 equivalent SNPs, to the next white box, our branch, with two equivalent SNPs. Our McNeil ancestor, the McCollum tester, and the Campbell tester have no unresolved private variants between them, which suggests they are probably closer in time than 10 generations back. You can see that the SNP generations are pushed “up” by the neighbor variants.

Because private variants don’t occur on a clock cycle and instead occur in individual lines at an unsteady rate, we must use averages.

That means that when we look further “up” the tree, clicking generation by generation on the up arrow above BY3344, the SNP generations on the left side “adjust” based on what is beneath, and unseen at that level.

The Block Tree Adjusts

Note, in the example above, BY3344 is at SNP generation 15.

Next, I clicked one generation upstream, to R-S668.

McNiel Big Y block tree S668

You can see that S668 is about 21 SNP generations upstream, and now BY3344 is listed as 20 generations, not 15. You can see our branch, BY3344, but you can no longer see subclades or our matches below that branch in this view.

You can, however, see two matches that descend through S668, brother branches to BY3344, red arrows at far right.

Clicking on the up arrow one more time shows us haplogroup S673, below, and the child branches. The three child branches on which the tester has matches are shown with red arrows.

McNiel Big Y S673

You’ll immediately notice that now S668 is shown at 19 SNP generations, not 20, and S673 is shown at 20. This SNP generation difference between views is a function of dealing with aggregated and averaged private variants on combined lines and causes the SNP generations to shift. This is also why I always say “about.”

As you continue to click up the tree, the shifting SNP generations continue, reminding us that we can’t truly see back in time. We can only achieve approximations, but those approximations improve as more people test, and more SNPs are named and placed in their proper places on the phylotree.

I love the Block Tree, although I wish I could see further side-to-side, allowing me to view all of the matches on one expanded tree so I can easily see their relationships to the tester, and each other.

Countries and Origins

In addition to displaying shared averaged autosomal origins of testers on a particular branch, if they have taken the Family Finder test and opted-in to sharing origins (ethnicity) results, you can also view the countries indicated by testers on that branch along with downstream branches of the tree.

McNiel Big Y countries

For example, the Countries tab for S673 is shown above. I can see matches on this branch with no downstream haplogroup currently assigned, as well as cumulative results from downstream branches.

Still, I need to be able to view this information in a more linear format.

The Block Tree and spreadsheet information beautifully augment the haplotree, so let’s take a look.

The Haplotree

On your Y DNA results page, click on the “Haplotree and SNPs” link.

McNIel Big Y haplotree menu

The Y haplotree will be displayed in pedigree style, quite familiar to genealogists. The SNP legend will be shown at the top of the display. In some cases, “presumed positive” results occur where coverage is lacking, back mutations or read errors are encountered. Presumed positive is based on positive SNPs further down the tree. In other words, that yellow SNP below must read positive or downstream ones wouldn’t.

McNIel Big Y pedigree descent

The tester’s branch is shown with the grey bar. To the right of the haplogroup-defining SNP are listed the branch and equivalent SNP names. At far right, we see the total equivalent SNPs along with three dots that display the Country Report. I wish the haplotree also showed my matches, or at least my matching surnames, allowing me to click through. It doesn’t, so I have to return to the Big Y page or STR Matches page, or both.

I’ve starred each branch through which my McNiell cousin descends. Sibling branches are shown in grey. As you’ll recall from the Block Tree, we do have matches on those sibling branches, shown side by side with our branch.

The small numbers to the right of the haplogroup names indicate the number of downstream branches. BY18350 has three, all displayed. But looking upstream a bit, we see that DF97 has 135 downstream branches. We also have matches on several of those branches. To show those branches, simply click on the haplogroup.

The challenge for me, with 119 McNeill matches, is that I want to see a combination of the block tree, my spreadsheet information, and the haplotree. The block tree shows the names, my spreadsheet tells me on which branches to look for those matches. Many aren’t easily visible on the block tree because they are downstream on sibling branches.

Here’s where you can find and view different pieces of information.

Data and Sources | STR Matches Page | Big Y Matches Page | Block Tree | Haplogroups & SNPs Page
STR matches | Yes | No, but would like to see who matches at which STR levels | If they have taken Big Y test, but doesn’t mean they match on Big Y matching | No
SNP matches *1 | Shows if STR match has common haplogroup, but not if tester matches on Big Y | No, but would like to see who matches at which STR level | Big Y matches and STR matches that aren’t Big Y matches are both shown | No, but need this feature – see combined haplotree/block tree
Other Haplogroup Branch Residents | Yes, both estimated and tested | No, use block tree or click through to profile card, would like to see haplogroup listed for Big Y matches | Yes, both Big Y and STR tested, not estimated. Cannot tell if person is Big Y match or STR match, or both. | No individuals, but would like that as part of countries report, see combined haplotree/block tree
Fully Expanded Phylotree | No | No | Would like ability to see all branches with whom any Big Y or STR match resides at one time, even if it requires scrolling | Yes, but no match information. Matches report could be added like on Block Tree.
Averaged Ethnicities if Have FF Test | No | No | Yes, by haplogroup branch | No
Countries | Matches map STR only | No, need Big Y matches map | Yes | Yes
Earliest Known Ancestor | Yes | No, but can click through to profile card | No | No
Customer Trees | Yes | No, need this link | No | No
Profile Card | Yes, click through | Yes, click through | Yes, click through | No match info on this page
Downloadable data | By STR panel only, would like complete download with 1 click, also if Big Y or FF match | Not available at all | No | No
Path to common haplogroup | No | No, but would like to see matches haplogroup and convergent haplogroup displayed | No, would like the path to convergent haplogroup displayed as an option | No, see combined match-block-haplotree in next section

*1 – the best way to see the haplogroup of a Big Y match is to click on their name to view their profile card since haplogroup is not displayed on the Big Y match page. If you happen to also match on STRs, their haplogroup is shown there as well. You can also search for their name using the block tree search function to view their haplogroup.

Necessity being the mother of invention, I created a combined match/block tree/haplotree.

And I really, REALLY hope Family Tree DNA implements something like this because, trust me, this was NOT fun! However, now that it’s done, it is extremely useful. With fewer matches, it should be a breeze.

Here are the steps to create the combined reference tree.

Combo Match/Block/Haplotree

I used Snagit to grab screenshots of the various portions of the haplotree and typed the surnames of the matches in the location of our common convergent haplogroup, taken from the spreadsheet. I also added the SNP generations in red for that haplogroup, at far left, to get some idea of when that common ancestor occurred.

McNIel Big Y combo tree

This is, in essence, the end-goal of this exercise. There are a few steps to gather data.

Following the path of two matches (the tester and a specific match), you can find their common haplogroup. If your match is shown on the block tree in the same view with your branch, it’s easy to see your common convergent parent haplogroup. If you can’t see the common haplogroup, it takes a few extra steps, clicking up the block tree as illustrated in an earlier section.

We need the ability to click on a match and have a tree display showing both paths to the common haplogroup.

McNiel Big Y convergent

I simulated this functionality in a spreadsheet with my McNiel cousin, a Riley match, and an Ocain match whose terminal SNP is the convergent SNP (M222) between Riley and McNiel. Of course, I’d also like to be able to click to see everyone on one chart on their appropriate branches.
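For readers who keep their match data in a spreadsheet or script, the same path-following idea can be sketched in a few lines of Python. This is only an illustration, not something Family Tree DNA provides. The function name is made up, the McNiel path is abbreviated to haplogroups named in this article, and the Riley branch label is a hypothetical placeholder because that match’s actual downstream SNPs aren’t listed here.

```python
def convergent_haplogroup(path_a, path_b):
    """Return the most recent haplogroup shared by two paths.

    Each path is a list of haplogroup names ordered from oldest to newest.
    Returns None if the two paths share nothing.
    """
    common = None
    for snp_a, snp_b in zip(path_a, path_b):
        if snp_a != snp_b:
            break  # the two lines diverge at this point
        common = snp_a
    return common


# Abbreviated path for the McNiel tester, using haplogroups named in this article.
mcniel = ["M222", "DF105", "DF85", "S673", "S668", "BY3344", "BY18350", "BY18332"]
# Hypothetical placeholder for a match on a divergent branch below M222.
riley = ["M222", "RILEY-BRANCH"]
# A match whose terminal SNP is M222 itself.
ocain = ["M222"]

print(convergent_haplogroup(mcniel, riley))
print(convergent_haplogroup(mcniel, ocain))
```

Both comparisons converge at M222, which is what the spreadsheet simulation shows for these three men.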

Combining this information onto the haplotree, in the first image below, at M222, 4 men match my McNeill cousin: 2 who show M222 as their terminal SNP, and 2 downstream of M222 on a divergent branch that isn’t our direct branch. In other words, M222 is the convergence point for all 4 men plus my McNeill cousin.

McNiel Big Y M222 haplotree

In the graphic below, you can see that M222 has a very large number of equivalent SNPs, which will likely become downstream haplogroups at some point in the future. However, today, these equivalent SNPs push M222 from 25 generations to 59. We’ll discuss how this meshes with known history in a minute.

McNiel Big Y M222 block tree

Two men, Ocain and Ransom, who have both taken the Big Y, whose terminal SNP is M222, match my McNiel cousin. If their common ancestor was actually 59 generations in the past, it’s very, very unlikely that they would match at all given the 30 mutation threshold.
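To make the threshold concrete, here is a minimal sketch of that filter in Python. The function name and the difference counts are invented for illustration, and treating exactly 30 as a match is my assumption; the article only tells us that men with more than 30 mutations total are not shown as Big Y matches.

```python
BIG_Y_MATCH_THRESHOLD = 30  # threshold cited in this article


def is_big_y_match(total_differences, threshold=BIG_Y_MATCH_THRESHOLD):
    """Return True when the mutation count is at or under the threshold.

    Assumption: a count of exactly 30 still qualifies as a match.
    """
    return total_differences <= threshold


# Made-up difference counts for four hypothetical testers.
candidates = {"Tester A": 12, "Tester B": 30, "Tester C": 31, "Tester D": 85}
matches = [name for name, diffs in candidates.items() if is_big_y_match(diffs)]
print(matches)
```

Only the testers at or below the threshold survive the filter, which is why a common ancestor that is truly very distant rarely produces a visible match.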

On my reconstructed Match/Block/Haplotree, I included the estimated SNP generations as well. We are starting with the most distant haplogroups and working our way forward in time with the graphics, below.

Make no mistake, there are thousands more men who descend from M222 that have tested, but all of those men except 4 have more than 30 mutations total, so they are not shown as Big Y matches, and they are not shown individually on the Block Tree because they match on neither the Big Y nor the STR tests. However, there is a way to view information for non-matching men who test positive for M222.

McNiel Big Y M222 countries

Looking at the Block Tree for M222, many STR match men took a SNP test only to confirm M222, so they would be shown positive for the M222 SNP on STR results and, therefore, in the detailed view of M222 on the Block tree.

Haplogroup information about men who took the M222 test and whom the tester doesn’t match at all is shown here as well, in the country and branch totals for R-M222. Their names aren’t displayed because they don’t match the tester on either type of Y DNA test.

Back to constructing my combined tree, I’ve left S658 in both images, above and below, as an overlap placeholder, as we move further down, or towards current, on the haplotree.

McNiel Big Y combo tree center

Note that BY18350, above, is also an overlap connecting below.

You’ll recall that as a result of the Big Y test, BY18350 was split and now has three child branches plus one person whose terminal SNP is BY18350. All of the men shown below were on one branch until Big Y results revealed that BY18350 needed to be split, with multiple new haplogroups added to the tree.

McNiel Big Y combo tree current

Using this combination of tools, it’s straightforward for me to see now that our McNiel line is closest to the Campbell tester from Scotland according to the Big Y test + STRs.

Equal according to the Big Y test, but slightly more distant, according to STR matching, is McCollum. The next closest would be sibling branches. Then in the parent group of the other three, BY18350, we find Glass from Scotland.

In BY18350 and subgroups, we find several Scotland locations and one Northern Ireland, which was likely from Scotland initially, given the surname and Ulster Plantation era.

The next upstream parent haplogroup is BY3344, which looks to be weighted towards ancestors from Scotland, shown on the country card, below.

McNiel Big Y BY3344

This suggests that the origins of the McNiel line were, perhaps, in Scotland, but it doesn’t tell us whether George and, presumably, Thomas immigrated from Ireland or Scotland.

This combined tree, with SNPs, surnames from Big Y matches, along with Country information, allows me to see who is really more closely related and who is further away.

What I didn’t do, and probably should, is to add in all of the STR matches who have taken the Big Y test, shown on their convergent branch – but that’s just beyond the scope of time I’m willing to invest, at least for now, given that hundreds of STR matches have taken the Big Y test, and the work of building the combined tree is all manual today.

For those reading this article without access to the Y phylogenetic tree, there’s a public version of the Y and mitochondrial phylotrees available, here.

What About Those McNiels?

No other known McNiel descendants from either Thomas or George have taken the Big Y test, so I didn’t expect any to match, but I am interested in other men by similar surnames. Does ANY other McNiel have a Big Y match?

As it turns out, there are two, plus one STR match who took a Big Y test, but is not a Big Y match.

However, as you can see on the combined match/block/haplotree, above, the closest other Big Y-matching McNeil male is found at about 19 SNP generations, or roughly 1900 years ago. Even if you remove some of the variants in the lower generations that are based on an average number of individual variants, you’re still about 1200 years in the past. It’s extremely doubtful that any surname would survive in both lines from the year 800 or so.

That McNeil tester’s ancestor was born in 1747 in Tranent, Scotland.

The second Big Y-matching person is an O’Neil, a few branches further up in the tree.

The convergent SNP of the two branches, meaning O’Neil and McNeill, is at approximately the 21 generation level. The O’Neil man’s Neill ancestor is found in 1843 in Cookstown, County Tyrone, Ireland.

McNiel Big Y convergent McNeil lines

I created a spreadsheet showing convergent lines:

  • The McNeill man with haplogroup A4697 (ancestor Tranent, Scotland) is clearly closest genetically.
  • O’Neill BY91591, who is brother clades with Neel and Neal, all Irish, is another Big Y match.
  • The McNeill man with haplogroup FT91182 is an STR match, but not a Big Y match.

The convergent haplogroup of all of these men is DF105 at about the 22 SNP generation marker.

STRs

Let’s turn back to STR tests, with results that produce matches closer in time.

Searching my STR download spreadsheet for similar surnames, I discovered several surname matches. Mining the Earliest Known Ancestor information, profiles, and trees produced the following data:

Ancestor | STR Match Level | Location
George Charles Neil | 12, 25, match on Big Y A4697 | 1747-1814 Tranent, Scotland
Hugh McNeil | 25 (tested at 67) | Born 1800 County Antrim, Northern Ireland
Duncan McNeill | 12 (tested at 111) | Married 1789, Argyllshire, Scotland
William McNeill | 12, 25 (tested at 37) | Blackbraes, Stirlingshire, Scotland
William McNiel | 25 (tested at 67) | Born 1832 Scotland
Patrick McNiel | 25 (tested at 111) | Trien East, County Roscommon, Ireland
Daniel McNeill | 25 (tested at 67) | Born 1764 Londonderry, Northern Ireland
McNeil | 12 (tested at 67) | 1800 Ireland
McNeill (2 matches) | 25 (tested Big Y, SNP FT91182) | 1810, Antrim, Northern Ireland
Neal | 25 (tested Big Y, SNP BY146184) | Antrim, Northern Ireland
Neel (2 matches) | 67 (tested at 111, and Big Y) | 1750 Ireland, Northern Ireland

Our best clue that includes a Big Y and STR match is a descendant of George Charles Neil born in Tranent, Scotland, in 1747.

Perhaps our second-best clue comes in the form of a 111 marker match to a descendant of one Thomas McNeil, who appears in records as early as 1753 and died in 1761 in Rombout Precinct, Dutchess County, NY, where his son John was born. This line and another match at a lower level both reportedly track back to early New Hampshire in the 1600s.

The MacNeil DNA Project tells us the following:

Participant 106370 descends from Isaiah McNeil b. 14 May 1786 Schaghticoke, Rensselaer Co. NY and d. 28 Aug 1855 Poughkeepsie, Dutchess Co., NY, who married Alida VanSchoonhoven.

Isaiah’s parents were John McNeal, baptized 21 Jun 1761 Rombout, Dutchess Co., NY, d. 15 Feb 1820 Stillwater, Saratoga Co., NY and Helena Van De Bogart.

John’s parents were Thomas McNeal, b.c. 1725, d. 14 Aug 1761 NY and Rachel Haff.

Thomas’s parents were John McNeal Jr., b. around 1700, d. 1762 Wallkill, Orange Co., NY (now Ulster Co. formed 1683) and Martha Borland.

John’s parents were John McNeal Sr. and ? From. It appears that John Sr. and his family were this participant’s first generation of Americans.

Searching this line on Ancestry, I discovered additional information that, if accurate, may be relevant. This lineage, if correct, and it may not be, possibly reaches back to Edinburgh, Scotland. While the information gathered from Ancestry trees is certainly not compelling in and of itself, it provides a place to begin research.

Unfortunately, based on matches shown on the MacNeil DNA Project public page, STR marker mutations for kits 30279, B78471 and 417040 when compared to others don’t aid in clustering or indicating which men might be related to this group more closely than others using line-marker mutations.

Matches Map

Let’s take a look at what the STR Matches Map tells us.

McNiel Big Y matches map menu

This 67 marker Matches Map shows the locations of the earliest known ancestors of STR matches who have entered location information.

McNiel Big Y matches map

McNiel Big Y matches map legend

My McNeill cousin’s closest matches are scattered with no clear cluster pattern.

Unfortunately, there is no corresponding map for Big Y matches.

SNP Map

The SNP map provided under the Y DNA results allows testers to view the locations where specific haplogroups are found.

McNiel Big Y SNP map

The SNP map marks an area where at least two or more people have claimed their most distant known ancestor to be. The cluster size is the maximum amount of miles between people that is allowed in order for a marker indicating a cluster at a location to appear. So for example, the sample size is at least 2 people who have tested, and listed their most distant known ancestor, the cluster is the radius those two people can be found in. So, if you have 10 red dots, that means in 1000 miles there are 10 clusters of at least two people for that particular SNP. Note that these locations do NOT include people who have tested positive for downstream locations, although it does include people who have taken individual SNP tests.

Working my way from the McNiel haplogroup backward in time on the SNP map, neither BY18332 nor BY18350 has enough people who’ve tested, or those testers didn’t provide a location.

Moving to the next haplogroup up the tree, two clusters are formed for BY3344, shown below.

McNIel Big Y BY3344 map

S668, below.

McNiel Big Y S668 map

It’s interesting that one cluster includes Glasgow.

S673, below.

McNiel Big Y S673 map

DF85, below:

McNiel Big Y DF85 map

DF105 below:

McNiel BIg Y DF105 map

M222, below:

McNiel Big Y M222 map

For R-M222, I’ve cropped the locations beyond Ireland and Scotland. Clearly, R-M222 is the most prevalent in Ireland, followed by Scotland. Wherever M222 originated, it has saturated Ireland and spread widely in Scotland as well.

R-M222

R-M222, the SNP initially thought to indicate Niall of the 9 Hostages, occurred roughly 25-59 SNP generations in the past. If this age is even remotely accurate, averaging by 80 years per generation often utilized for Big Y results, produces an age of 2000 – 4720 years. I find it extremely difficult to believe any semblance of a surname survived that long. Even if you reduce the time in the past to the historical narrative, roughly the year 400, 1600 years, I still have a difficult time believing the McNiel surname is a result of being a descendant of Niall of the 9 Hostages directly, although oral history does have staying power, especially in a clan setting where clan membership confers an advantage.
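The age range quoted above is simple multiplication, sketched here in Python so the assumption is easy to change. The function name is made up for illustration, and the 80 years per SNP generation is the average mentioned in the paragraph above, not a fixed constant.

```python
def snp_generations_to_years(snp_generations, years_per_generation=80):
    """Convert SNP generations to approximate years before present."""
    return snp_generations * years_per_generation


# The two SNP generation estimates shown for R-M222.
low = snp_generations_to_years(25)
high = snp_generations_to_years(59)
print(low, high)  # 2000 4720
```

Changing years_per_generation shifts both ends of the range, which is a good reminder of how sensitive these dates are to the averaging assumption.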

Surname or not, clearly, our line along with the others whom we match on the Big Y do descend from a prolific common ancestor. It’s very unlikely that the mutation occurred in Niall’s generation, and much more likely that other men carried M222 and shared a common ancestor with Niall at some point in the distant past.

McNiel Conclusion – Is There One?

If I had two McNiel wishes, they would be:

  • Finding records someplace in Virginia that connect George and presumably brothers Thomas and John to their parents.
  • A McNiel male from wherever our McNiel line originated becoming inspired to Y DNA test. Finding a male from the homeland might point the way to records in which I could potentially find baptismal records for George about 1720 and Thomas about 1724, along with possibly John, if he existed.

I remain hopeful for a McNiel from Edinburgh, or perhaps Glasgow.

I feel reasonably confident that our line originated genetically in Scotland. That likely precludes Niall of the 9 Hostages as a direct ancestor, but perhaps not. Certainly, one of his descendants could have crossed the channel to Scotland. Or, perhaps, our common ancestor is further back in time. Based on the maps, it’s clear that M222 saturates Ireland and is found widely in Scotland as well.

A great deal depends on the actual age of M222 and where it originated. Certainly, Niall had ancestors too, and the Ui Neill dynasty reaches further back, genetically, than their recorded history in Ireland. Given the density of M222 and spread, it’s very likely that M222 did, in fact, originate in Ireland or, alternatively, very early in Scotland and proliferated in Ireland.

If the Ui Neill dynasty was represented in the persona of the High King, Niall of the 9 Hostages, 1600 years ago, his M222 ancestors were clearly inhabiting Ireland earlier.

We may not be descended from Niall personally, but we are assuredly related to him, sharing a common ancestor sometime back in the prehistory of Ireland and Scotland. That man would sire most of the Irish men today and clearly, many Scots as well.

Our ancestors, whoever they were, were indeed in Ireland millennia ago. R-M222, our ancestor, was the ancestor of the Ui Neill dynasty and of our own Reverend George McNiel.

Our ancestors may have been at Knowth and New Grange, and yes, perhaps even at Tara.

Tara Niall mound in sun

Someplace in the mists of history, one man made a different choice, perhaps paddling across the channel, never to return, resulting in M222 descendants being found in Scotland. His descendants include our McNeil ancestors, who still slumber someplace, awaiting discovery.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Genetic Affairs: AutoPedigree Combines AutoTree with WATO to Identify Your Potential Tree Locations

July 2020 Update: Please note that Ancestry issued a cease-and-desist order against Genetic Affairs, and this tool no longer works at Ancestry. The great news is that it still works at the other vendors, and you can ask Ancestry matches to transfer, which is free.

If you’re an adoptee or searching for an unknown parent or ancestor, AutoPedigree is just what you’ve been waiting for.

By now, we’re all familiar with Genetic Affairs, which launched in 2018 with its signature AutoCluster tool. AutoCluster groups your matches into clusters based on which of your matches match each other, in addition to you.

browser autocluster

A year later, in December 2019, Genetic Affairs introduced AutoTree, automated tree reconstruction based on your matches’ trees at Ancestry and Family Finder at Family Tree DNA, even if you don’t have a tree.

Now, Genetic Affairs has introduced AutoPedigree, which combines the AutoTree reconstruction technology with WATO, What Are the Odds, as seen here at DNAPainter. WATO is a statistical probability technique developed by the DNAGeek that allows users to review possible positions in a tree to see where they best fit.

Here’s the progressive functionality of how the three Genetic Affairs tools, combined, function:

  • AutoCluster groups people based on if they match you and each other
  • AutoTree finds common ancestors for trees from each cluster
  • Next, AutoTree finds the trees of all matches combined, including from trees of your DNA matches not in clusters
  • AutoPedigree checks to see if a common ancestor tree meets the minimum requirement, which is at least 3 matches of greater than or equal to 30-40 cM. If yes, an AutoPedigree with hypotheses is created based on the common ancestor of the matching people.
  • Combined AutoPedigrees then reviews all AutoTrees and AutoPedigrees that have common ancestors and combines them into larger trees.
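The minimum-requirement check in the list above can be pictured with a small Python sketch. This is not Genetic Affairs’ code. The function name and parameters are invented for illustration, and because the article gives the cM floor as a range of 30-40 cM, the floor is left adjustable. Apart from 44.2 cM, the match amounts are made up.

```python
def meets_minimum(match_cms, min_matches=3, min_cm=30.0):
    """Return True if enough matches meet the cM floor.

    match_cms is a list of shared cM amounts for the matches who
    descend from one common ancestor tree.
    """
    qualifying = [cm for cm in match_cms if cm >= min_cm]
    return len(qualifying) >= min_matches


# Illustrative match amounts for two hypothetical common ancestor trees.
print(meets_minimum([44.2, 61.0, 35.5]))               # three matches at 30 cM or more
print(meets_minimum([44.2, 61.0, 22.0]))               # only two matches qualify
print(meets_minimum([44.2, 61.0, 35.5], min_cm=40.0))  # stricter floor, only two qualify
```

The first tree would get an AutoPedigree under a 30 cM floor, while the second would not, and the first also drops out if the floor is raised to 40 cM.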

Let’s look at examples, beginning with DNAPainter who first implemented a form of WATO.

DNA Painter

Let’s say you’re trying to figure out how you’re related to a group of people who descend from a specific ancestral couple. This is particularly useful for someone seeking unknown parents or other unknown relationships.

DNA tools are always from the perspective of the tester, the person whose kit is being utilized.

At DNAPainter, you manually create the pedigree chart beginning with a common couple and creating branches to all of their descendants that you match.

This example at DNAPainter shows the matches with their cM amounts in yellow boxes.

xAutoPedigree DNAPainter WATO2

The tester doesn’t know where they fit in this pedigree chart, so they add other known lines and create hypothesis placeholder possibilities in light blue.

In other words, if you’re searching for your mother and you were born in 1970, you know that your mother was likely born between 1925 (if she was 45 when she gave birth to you) and 1955 (if she was 15 when she gave birth to you.) Therefore, in the family you create, you’d search for parents who could have given birth to children during those years and create hypothetical children in those tree locations.
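The date arithmetic in that example is easy to reuse for other birth years. Here is a minimal Python sketch. The function name is made up, and the age limits of 15 and 45 are simply the ones used in the example above.

```python
def parent_birth_window(child_birth_year, youngest_age=15, oldest_age=45):
    """Return the (earliest, latest) birth years for a possible parent."""
    earliest = child_birth_year - oldest_age   # parent was oldest_age at the birth
    latest = child_birth_year - youngest_age   # parent was youngest_age at the birth
    return earliest, latest


print(parent_birth_window(1970))  # (1925, 1955)
```

Any couple in the tree who could have had a child born inside that window is a candidate for a hypothetical child placeholder.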

The WATO tool then utilizes the combination of expected cMs at that position to create scores for each hypothesis position based on how closely or distantly you match other members of that extended family.

The Shared cM Project, created and recently updated by Blaine Bettinger, is used as the foundation for the expected centimorgan (cM) ranges of each relationship. DNAPainter has automated the possible relationships for any given matching cM amount, here.

In the graphic above, you can see that the best hypothesis is #2 with a score of 1, followed by #4 and #5 with scores of 3 each. Hypothesis 1 has a score of 63.8979 and hypothesis 3 has a score of 383.

You’ll need to scroll to the bottom to determine which of the various hypotheses are more likely.

Autopedigree DNAPainter calculated probability

Using DNAPainter’s WATO implementation requires you to create the pedigree tree to test the hypothesis. The benefit of this is that you can construct the actual pedigree as known based on genealogical research. The downside, of course, is that you have to research each line down to the current generation to be able to create the pedigree accurately, and that’s a long and sometimes difficult manual process.

Genetic Affairs and WATO

Genetic Affairs takes a different approach to WATO. Genetic Affairs removes the need for hand entry by scanning your matches at Ancestry and Family Tree DNA, automatically creating pedigrees based on your matches’ trees. In addition, Genetic Affairs automatically creates multiple hypotheses. You may need to utilize both approaches, meaning Genetic Affairs and DNAPainter, depending on who has tested, tree completeness at the vendors, and other factors.

The great news is that you can import the Genetic Affairs reconstructed trees into DNAPainter’s WATO tool instead of creating the pedigrees from scratch. Of course, Genetic Affairs can only use the trees someone has entered. You, on the other hand, can create a more complete tree at DNAPainter.

Combining the two tools leverages the unique and best features of both.

Genetic Affairs AutoPedigree Options

Recently, Genetic Affairs released AutoPedigree, their new tool that utilizes the reconstructed AutoTrees+WATO to place the tester in the most likely region or locations in the reconstructed tree.

Let’s take a look at an example. I’m using my own kit to see what kind of results and hypotheses exist for where I fit in the tree reconstructed from my matches and their trees.

If you actually do have a tree, the AutoTree portion will simply be counted as an equal tree to everyone else’s trees, but AutoPedigree will ignore your tree, creating hypotheses as if it doesn’t exist. That’s great for adoptees who may have hypothetical trees in progress, because that tree is disregarded.

First, sign on to your account at Genetic Affairs and select the AutoPedigree option for either Ancestry or Family Tree DNA which reconstructs trees and generates hypotheses automatically. For AutoPedigree construction, you cannot combine the results from Ancestry and FamilyTreeDNA like you can when reconstructing trees alone. You’ll need to do an AutoPedigree run for each vendor. The good news is that while Ancestry has more testers and matches, FamilyTreeDNA has many testers stretching back 20 years or so in the past who passed away before testing became available at Ancestry. Often, their testers reach back a generation or two further. You can easily transfer Ancestry (and other) results to Family Tree DNA for free to obtain more matches – step-by-step instructions here.

At Genetic Affairs, you should also consider including half-relations, especially if you are dealing with an unknown parent situation. Selecting half-relationships generates very large trees, so you might want to do the first run without, then a second run with half relationships selected.

AutoPedigree options

Results

I ran the program and opened the resulting email with the zip file. Saving that file automatically unzips for me, displaying the following 5 files and folders.

Autopedigree cluster

Clicking on the AutoCluster HTML link reveals the now-familiar clusters, shown below.

Autopedigree clusters

I have a total of 26 clusters, only partially shown above. My first peach cluster and my 9th blue cluster are huge.

Autopedigree 26 clusters

That’s great news because it means that I have a lot to work with.

autopedigree folder

Next, you’ll want to click to open your AutoPedigree folder.

For each cluster, you’ll have a corresponding AutoPedigree file if an AutoPedigree can be generated from the trees of the people in that cluster.

My first cluster is simply too large to show successfully in blog format, so I’m selecting a smaller cluster, #21, shown below with the red arrow, with only 6 members. Why so small, you ask? In part, because I want to illustrate the fact that you really don’t need a lot of matches for the AutoPedigree tool to be useful.

Autopedigree multiple clusters

Note also that this entire group of clusters (blue through brown) has members in more than one cluster, indicated by the grey cells that mean someone is a member of at least 2 clusters. That tells me that I need to include the information from those clusters too in my analysis. Fortunately, Genetic Affairs realizes that and provides a combined AutoPedigree tool for that as well, which we will cover later in the article. Just note for now that the blue through brown clusters seem to be related to cluster 21.

Let’s look at cluster 21.

autopedigree cluster 21

In the AutoPedigree folder, you’ll see cluster files when there are trees available to create pedigrees for individual clusters. If you’re lucky, you’ll find 2 files for some clusters.

autopedigree ancestors

At the top of each cluster AutoPedigree file, Genetic Affairs shows you the home couple of the descendant group shown in the matches and their corresponding trees.

Autopedigree WATO chart

Image 1

I don’t expect you to be able to read everything in the above pedigree chart, just note the matches and arrows.

You can see three of my cousins who match, labeled with “Ancestry.” You also see branches that generate a viable hypothesis. When generating AutoPedigrees, Genetic Affairs truncates any branches that cannot result in a viable hypothesis for placing the tester in a viable location on the tree, so you may not see all matches.

Autopedigree hyp 1

Image 2

On the top branch, you’ll see hyp-1-child1, which is the first hypothesis, with the first child. Their child is hyp-2-child2, and their child is hyp-3-child3. The tester (me, in this case) cannot be any of the people shown with red flags, called badges, based on how I match other people and other tree information such as birth and death dates.

Think of a stoplight: red means no, green marks your best bets, and yellow means maybe. AutoPedigree makes no decisions. It only shows you options and calculates mathematically how probable each location is to be correct.

Remember, these “children,” such as hypothesis 1, child 1, may or may not have actually existed. These relationships are hypothetical, showing you where the tester could appear on the tree IF these people existed.

We know that I don’t fit on the branch above hypothesis 1, because I only match the descendant of Adam Lentz at 44.2 cM, which is statistically too low for me to also inhabit that branch.
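To make the “statistically too low” reasoning concrete, here's a minimal Python sketch of a range check. The relationship labels and cM ranges are illustrative placeholders, not published statistics. In practice you'd substitute ranges from a reference such as the Shared cM Project, and AutoPedigree itself reports how probable each placement is rather than a simple in-or-out answer.

```python
# Illustrative placeholder ranges in cM (low, high); NOT published statistics.
PLACEHOLDER_RANGES = {
    "close cousin": (400.0, 1400.0),
    "mid cousin": (40.0, 600.0),
    "distant cousin": (0.0, 230.0),
}

def plausible_relationships(shared_cm, ranges):
    """Return the relationship labels whose cM range contains shared_cm."""
    return [
        label
        for label, (low, high) in ranges.items()
        if low <= shared_cm <= high
    ]

# 44.2 cM is the match amount mentioned in the text
result = plausible_relationships(44.2, PLACEHOLDER_RANGES)
print(result)
```

With these placeholder ranges, a 44.2 cM match is ruled out of the closest category, which is the same kind of elimination that removes a branch from consideration.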

I’ve included half relationships, so we see hyp-7-child1-half too, which is a half-sibling.

Hypotheses 1, 2, and 7 all have red badges, meaning not possible, so they have a score of 0. Hypotheses 3 and 8 are both possible, each with a ranking of 16.

autopedigree my location

Image 3

Looking now at the next segment of the tree, you see that based on how I match my Deatsman and Hartman cousins, I can potentially fit in any portion of the tree with green badges (in the red boxes) or yellow badges.

You can also see where I actually fit in the tree. HOWEVER, that placement is from AutoTree, the tree reconstruction portion, based on the fact that I have a tree (or someone has a tree with me in it). My own tree is ignored for the AutoPedigree hypothesis generation portion.

Had my first cousins once removed through my grandfather John Ferverda’s brother, Roscoe, tested AND HAD A TREE, there would have been no question where I fit based on how I match them.

autopedigree cousins

As it turns out, they did test but provided no tree, meaning that Genetic Affairs had no tree to work with.

Remember that I mentioned that my first cluster was huge. Many more matches mean that Genetic Affairs has more to work with. From that cluster, here’s an example of a hypothesis being accurate.

autopedigree correct

Image 4

You can see the hypothetical line beneath my own line, with hypotheses 104, 105, 106, 107, and 108. The AutoTree portion of my tree is shown above, with my father and grandparents and my name in the green block. The AutoPedigree portion ignores my own tree, so it generates a hypothesis, with a rank of 2, for where I could fit. And yes, that’s exactly where I fit in the tree.

In this case, there were some hypotheses ranked at 1, but they were incorrect, so be sure to evaluate all good (green) options, then yellow, in that order.
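Here's a small Python sketch of that review order, using the stoplight idea: drop red, then look at green before yellow. The hypothesis names and badge colors are made up for illustration, and the structure is only an assumption about how you might jot down your own notes, not an AutoPedigree file format.

```python
# Review order by badge color; red badges are excluded entirely.
BADGE_ORDER = {"green": 0, "yellow": 1}

def review_order(hypotheses):
    """Drop red-badge hypotheses, then list green before yellow.

    sorted() is stable, so hypotheses with the same badge keep their
    original relative order.
    """
    viable = [h for h in hypotheses if h["badge"] in BADGE_ORDER]
    return sorted(viable, key=lambda h: BADGE_ORDER[h["badge"]])

# Hypothetical notes (names and colors invented for illustration)
notes = [
    {"name": "hyp-A", "badge": "red"},
    {"name": "hyp-B", "badge": "yellow"},
    {"name": "hyp-C", "badge": "green"},
    {"name": "hyp-D", "badge": "green"},
]

ordered_names = [h["name"] for h in review_order(notes)]
print(ordered_names)
```

The point is only the ordering. Each remaining hypothesis still needs to be checked against the actual genealogy, since a well-ranked hypothesis can still be wrong.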

Genetic Affairs cannot work with 23andMe results for AutoPedigree because 23andMe doesn’t provide or support trees on their site. AutoClusters are integrated at MyHeritage, but not the AutoTree or AutoPedigree functions, and they cannot be run separately.

That leaves Family Tree DNA and Ancestry.

Combined AutoPedigree

After evaluating the AutoPedigree for each cluster that has one, click on the various combined-cluster AutoPedigrees.

autopedigree combined

You can see that for cluster 1, I have 7 separate AutoPedigrees based on different common ancestors. I also have 3 AutoPedigrees for cluster 9, and 2 AutoPedigrees for clusters 15, 21, and 24.

I have no AutoPedigrees for clusters 2, 3, 5, 6, 7, 8, 14, 17, 18, and 22.

Now let’s move to the combined clusters, whose numbers are NOT correlated to the cluster numbers themselves. Genetic Affairs has searched trees and combined ancestors from various clusters when common ancestors were found.

Autopedigree multiple clusters

Remember that I asked you to note that the above blue through brown clusters seem to have commonality between the clusters based on grey cell matches who are found in multiple groups? In fact, these people do share common ancestors, with a large combined AutoPedigree being generated from those multiple clusters.
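One way to picture the combining step is as grouping clusters that share an ancestor, either directly or through a chain of other clusters. The Python sketch below does just that. It is an assumption about the general idea, not Genetic Affairs' actual method, and the cluster numbers and ancestor names are invented.

```python
def combine_clusters(cluster_ancestors):
    """Group cluster ids that share at least one ancestor, directly or via a chain."""
    groups = []  # each entry: (set of cluster ids, set of ancestor names)
    for cluster_id, ancestors in cluster_ancestors.items():
        ids, names = {cluster_id}, set(ancestors)
        remaining = []
        for group_ids, group_names in groups:
            if group_names & names:
                # Shared ancestor found: fold the existing group into this one
                ids |= group_ids
                names |= group_names
            else:
                remaining.append((group_ids, group_names))
        remaining.append((ids, names))
        groups = remaining
    return sorted(sorted(ids) for ids, _ in groups)

# Hypothetical data: cluster number -> ancestors found in that cluster's trees
cluster_ancestors = {
    19: {"Ancestor P"},
    20: {"Ancestor P", "Ancestor Q"},
    21: {"Ancestor Q"},
    30: {"Ancestor Z"},
}

combined = combine_clusters(cluster_ancestors)
print(combined)
```

In this made-up example, clusters 19 and 21 share no ancestor with each other, yet both end up in one combined group because each overlaps with cluster 20.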

I know you can’t read the tree in the image that follows. I’m only including it so you’ll see the scale of that portion of my tree that can be reconstructed from my matches with hypotheses of where I fit.

autopedigree huge

Image 5

These larger combined pedigrees are very useful to tie the clusters together and understand how you match numerous people who descend from the same larger ancestral group, further back in time.

Integration with DNAPainter

autopedigree wato file

Each AutoPedigree file and combined cluster AutoPedigree file in the AutoPedigree folder is provided in WATO format, allowing you to import them into DNAPainter’s WATO tool.

autopedigree dnapainter import

You can manually flesh out the trees based on actual genealogy in WATO at DNAPainter, and manually add matches from GEDmatch, 23andMe, or MyHeritage, or matches from vendors where your matches’ trees may not exist but you know how your match connects to you.

Your AutoTree Ancestors

But wait, there’s more.

autopedigree ancestors folder

If you click on the Ancestors folder, you’ll see 5 options for tree generations 3-7.

autopedigree ancestor generations

My three-generation auto-generated reconstructed tree looks like this:

autopedigree my tree

Selecting the 5th generation level displays Jacob Lentz and Frederica Ruhle, the couple shown in the AutoCluster 21 and AutoPedigree examples earlier. The color-coding indicates the source of the ancestors in that position.

Autopedigree expanded tree


You will also note that Genetic Affairs indicates how many matches I have that share this common ancestor along with which clusters to view for matches relevant to specific ancestors. How cool is this?!!

Remember that you can also import the genetic match information for each AutoTree cluster found at Family Tree DNA into DNAPainter to paint those matches on your chromosomes using DNAPainter’s Cluster Auto Painter.

If you run AutoCluster for matches at 23andMe, MyHeritage, or FamilyTreeDNA, all vendors who provide segment information, you can also import that cluster segment information into DNAPainter for chromosome painting.

However, from that list of vendors, you can only generate AutoTrees and AutoPedigrees at Family Tree DNA. Given this, it’s in your best interest for your matches to test at, or upload their DNA (plus tree) to, Family Tree DNA, which both supports trees AND provides segment information, and where you can run AutoTree and AutoPedigree.

Have you painted your clusters or generated AutoTrees? If you’re an adoptee or looking for an unknown parent or grandparent, the new AutoPedigree function is exactly what you need.

Documentation

Genetic Affairs provides complete instructions for AutoPedigree in this newsletter, along with a user manual here, and the Facebook Genetic Affairs User Group can be found here.

I wrote the introductory article, AutoClustering by Genetic Affairs, here, and Genetic Affairs Reconstructs Trees from Genetic Clusters – Even Without Your Tree or Common Ancestors, here. You can read about DNAPainter, here.

Transfer your DNA file, for free, from Ancestry to Family Tree DNA or MyHeritage, by following the easy instructions, here.

Have fun! Your ancestors are waiting.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.
