Genetic Genealogy at 20 Years: Where Have We Been, Where Are We Going and What’s Important?

Thankfully, we’ve put 2020 in the rear-view mirror. We’ve also reached the 20-year, two-decade milestone since genetics was first added to the genealogist’s toolbox.

It seems both like yesterday and forever ago. And yes, I’ve been here the whole time, as a spectator, researcher, and active participant.

Let’s put this in perspective. On New Year’s Eve, right at midnight, in 2005, I was able to score kit number 50,000 at Family Tree DNA. I remember this because it seemed like such a bizarre thing to be doing at midnight on New Year’s Eve. But hey, we genealogists are what we are.

I knew that momentous kit number, which seemed just HUGE at the time, was on the threshold of being sold because I had inadvertently purchased kit 49,997 a few minutes earlier.

Somehow kit 50,000 seemed like such a huge milestone, a landmark – so I quickly bought kits 49,998, 49,999, and then…would I get it…YES…kit 50,000. Score!

That meant that in the 5 years FamilyTreeDNA had been in business, they had sold an average of 10,000 kits per year, or about 27 kits a day. Today, that’s a rounding error. Then it was momentous!

In reality, the sales were ramping up quickly, because very few kits were sold in 2000, and roughly 20,000 kits had been sold in 2005 alone. I know this because I purchased kit 28,429 during the holiday sale a year earlier.
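For anyone who wants to check that back-of-the-envelope math, here is a minimal Python sketch. It treats kit numbers as a running count of kits sold, which is an assumption made only for illustration, and the variable names are hypothetical.

```python
# Kit numbers quoted in this article. Treating a kit number as the total
# number of kits sold so far is an assumption made for this illustration.
KIT_BOUGHT_HOLIDAY_2004 = 28_429
KIT_BOUGHT_NEW_YEARS_EVE_2005 = 50_000
YEARS_IN_BUSINESS = 5


def average_sales(total_kits: int, years: int) -> tuple[float, float]:
    """Return (kits per year, kits per day) averaged over the period."""
    per_year = total_kits / years
    per_day = per_year / 365
    return per_year, per_day


per_year, per_day = average_sales(KIT_BOUGHT_NEW_YEARS_EVE_2005, YEARS_IN_BUSINESS)

# Kits sold between the two purchases, roughly calendar year 2005.
sold_in_2005 = KIT_BOUGHT_NEW_YEARS_EVE_2005 - KIT_BOUGHT_HOLIDAY_2004

print(f"Average per year: {per_year:,.0f}")
print(f"Average per day: {per_day:.0f}")
print(f"Sold during 2005: {sold_in_2005:,}")
```

The difference between the two kit numbers is where the "roughly 20,000 kits" figure for 2005 comes from.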

Of course, I had no idea who I’d test with that momentous New Year’s Eve Y DNA kit, but I assuredly would find someone. A few months later, I embarked on a road trip to visit an elderly family member with that kit in tow. Thank goodness I did, and they agreed and swabbed on the spot, because they are gone today and with them, the story of the Y line and autosomal DNA of their branch.

In the past two decades, almost an entire generation has slipped away, and with them, an entire genealogical library held in their DNA.

Today, more than 40 million people have tested with the four major DNA testing companies, although we don’t know exactly how many.

Lots of people have had more time to focus on genealogy in 2020, so let’s take a look at what’s important. What’s going on, and what matters beyond this month or year?

How has this industry changed in the last two decades, and where is it going?

Reflection

This seems like a good point to reflect a bit.

Professor Dan Bradley reflecting on early genetic research techniques in his lab at the Smurfit Institute of Genetics at Trinity College in Dublin. Photo by Roberta Estes

In the beginning, twenty years ago, two companies stuck their toes in the consumer DNA testing water: Oxford Ancestors and Family Tree DNA. About the same time, Sorenson Genomics and GeneTree were also entering that space, although Sorenson was a nonprofit. Today, of those, only FamilyTreeDNA remains, having adapted with the changing times – adding more products, testing, and sophistication.

Bryan Sykes, who founded Oxford Ancestors, announced in 2018 that he was retiring to live abroad and subsequently passed away in 2020. The website still exists, but the company has announced that they have ceased sales and the database will remain open until Sept 30, 2021.

James Sorenson died in 2008 and the assets of Sorenson Molecular Genealogy Foundation, including the Sorenson database, were sold to Ancestry in 2012. Eventually, Ancestry removed the public database in 2015.

Ancestry dabbled in Y and mtDNA for a while, too, destroying that database in 2014.

Other companies, too many to remember or mention, have come and gone as well. Some of the various company names have been recycled or purchased, but aren’t the same companies today.

In the DNA space, it was keep up, change, die or be sold. Of course, there was the small matter of being able to sell enough DNA kits to make enough money to stay in business at all. DNA processing equipment and a lab are expensive. Not just the equipment, but also the expertise.

The Next Wave

As time moved forward, new players entered the landscape, comprising the “Big 4” testing companies that constitute the ponds where genealogists fish today.

23andMe was the first to introduce autosomal DNA testing and matching. Their goal and focus was always medical genetics, but they recognized the potential in genealogists before anyone else, and we flocked to purchase tests.

Ancestry settled on autosomal only and relies on the size of their database, a large body of genealogy subscribers, and a widespread “feel-good” marketing campaign to sell DNA kits as the gateway to “discover who you are.”

FamilyTreeDNA did and still does offer all 3 kinds of tests. Over the years, they have enhanced both the Y DNA and mitochondrial product offerings significantly and are still known as “the science company.” They are the only company to offer the full range of Y DNA tests, including their flagship Big Y-700, full sequence mitochondrial testing along with matching for both products. Their autosomal product is called Family Finder.

MyHeritage entered the DNA testing space a few years after the others as the dark horse that few expected to be successful – but they fooled everyone. They have acquired companies and partnered along the way which allowed them to add customers (Promethease) and tools (such as AutoCluster by Genetic Affairs), boosting their number of users. Of course, MyHeritage also offers users a records research subscription service that you can try for free.

In summary:

One of the wonderful things that happened was that some vendors began to accept compatible raw DNA autosomal data transfer files from other vendors. Today, FamilyTreeDNA, MyHeritage, and GEDmatch DO accept transfer files, while Ancestry and 23andMe do not.

The transfers and matching are free, but advanced features require either a minimal unlock fee or a subscription plan.

There are other testing companies, some with niche markets and others not so reputable. For this article, I’m focusing on the primary DNA testing companies that are useful for genealogy and mainstream companion third-party tools that complement and enhance those services.

The Single Biggest Change

As I look back, the single biggest change is that genetic genealogy is no longer the pariah of genealogy. For the first few years after its introduction, DNA discussion was banned from the (now defunct) Rootsweb lists and summarily deleted. I know, that’s hard to believe today.

Why, you ask?

Reasons varied from “just because” to “DNA is cheating” and then morphed into “because DNA might do terrible things like, maybe, suggest that a person really wasn’t related to an ancestor in a lineage society.”

Bottom line – fear and misunderstanding. Change is exceedingly difficult for humans, and DNA definitely moved the genealogy cheese.

From that awkward beginning, genetic genealogy organically became a “thing,” a specific application of genealogy. There was paper-trail traditional genealogy and then the genetic aspect. Today, for almost everyone, DNA is “just another tool” in the genealogist’s toolbox, although it does require focused learning, just like any other tool.

DNA isn’t separate anymore, but is now an integral part of the genealogical whole. Having said that, DNA can’t solve all problems or answer all questions, but neither can traditional paper-trail genealogy. Together, each makes the other stronger and solves mysteries that neither can resolve alone.

Synergy.

I fully believe that we have still only scratched the surface of what’s possible.

Inheritance

As we talk about the various types of DNA testing and tools, here’s a quick graphic to remind you of how the different types of DNA are inherited.

  • Y DNA is inherited by males only, from their fathers, and informs us of the direct patrilineal (surname) line.
  • Mitochondrial DNA is inherited by everyone from their mothers and informs us of the mother’s matrilineal (mother’s mother’s mother’s) line.
  • Autosomal DNA can be inherited from potentially any ancestor in random but somewhat predictable amounts through both parents. The further back in time, the less identifiable DNA you’ll inherit from any specific ancestor. I wrote about that, here.

What’s Hot and What’s Not

Where should we be focused today and where is this industry going? What tools and articles popped up in 2020 to help further our genealogy addiction? I already published the most popular articles of 2020, here.

This industry started two decades ago with testing a few Y DNA and mitochondrial DNA markers, and we were utterly thrilled at the time. Both tests have advanced significantly and the prices have dropped like a stone. My first mitochondrial DNA test that tested only 400 locations cost more than $800 – back then.

Y DNA and mitochondrial DNA are still critically important to genetic genealogy. Both play unique roles and provide information that cannot be obtained through autosomal DNA testing. Today, relative to Y DNA and mitochondrial DNA, the biggest challenge, ironically, is educating newer genealogists, many of whom have never heard about anything other than autosomal (often ethnicity) testing, about the potential of these tests.

We have to educate in order to overcome the cacophony of “don’t bother because you don’t get as many matches.”

That’s like saying “don’t use the right size wrench because the last one didn’t fit and it’s a bother to reach into the toolbox.” Not to mention that if everyone tested, there would be a lot more matches, but I digress.

If you don’t use the right tool, and all of the tools at your disposal, you’re not going to get the best result possible.

The genealogical proof standard, the gold standard for genealogy research, calls for “a reasonably exhaustive search,” and if you haven’t at least considered if or how Y DNA and mitochondrial DNA along with autosomal testing can or might help, then your search is not yet exhaustive.

I attempt to obtain the Y and mitochondrial DNA of every ancestral line. In the article, Search Techniques for Y and Mitochondrial DNA Test Candidates, I described several methodologies to find appropriate testing candidates.

Y DNA – 20 Years and Still Critically Important

Y DNA tracks the Y chromosome for males via the patrilineal (surname) line, providing matching and historical migration information.

We started 20 years ago testing 10 STR markers. Today, we begin at 37 markers, can upgrade to 67 or 111, but the preferred test is the Big Y which provides results for 700+ STR markers plus results from the entire gold standard region of the Y chromosome in order to provide the most refined results. This allows genealogists to use STR markers and SNP results together for various aspects of genealogy.

I created a Y DNA resource page, here, in order to provide a repository for Y DNA information and updates in one place. I would encourage anyone who can to order or upgrade to the Big Y-700 test which provides critical lineage information in addition to and beyond traditional STR testing. Additionally, the Big Y-700 test helps build the Y DNA haplotree which is growing by leaps and bounds.

More new SNPs are found and named EVERY SINGLE DAY today at FamilyTreeDNA than were named in the first several years combined. The 2006 SNP tree listed a grand total of 459 SNPs that defined the Y DNA tree at that time, according to the ISOGG Y DNA SNP tree. Goran Rundfeldt, head of R&D at FamilyTreeDNA, posted this today:

2020 was an awful year in so many ways, but it was an unprecedented year for human paternal phylogenetic tree reconstruction. The FTDNA Haplotree or Great Tree of Mankind now includes:

37,534 branches with 12,696 added since 2019 – 51% growth!
defined by
349,097 SNPs with 131,820 added since 2019 – 61% growth!

In just one year, 207,536 SNPs were discovered and assigned FT SNP names. These SNPs will help define new branches and refine existing ones in the future.

The tree is constructed based on high coverage chromosome Y sequences from:
– More than 52,500 Big Y results
– Almost 4,000 NGS results from present-day anonymous men that participated in academic studies

Plus an additional 3,000 ancient DNA results from archaeological remains, of mixed quality and Y chromosome coverage at FamilyTreeDNA.

Wow, just wow.

These three new articles in 2020 will get you started on your Y DNA journey!

Mitochondrial DNA – Matrilineal Line of Humankind is Being Rewritten

The original Oxford Ancestors mitochondrial DNA test tested 400 locations. The original Family Tree DNA test tested around 1000 locations. Today, the full sequence mitochondrial DNA test is standard, testing all 16,569 locations of the mitochondrial genome.

Mitochondrial DNA tracks your mother’s direct maternal, or matrilineal line. I’ve created a mitochondrial DNA resource page, here, that includes easy step-by-step instructions for after you receive your results.

New articles in 2020 included the introduction of The Million Mito Project. 2021 should see the first results – including a paper currently in the works.

The Million Mito Project is rewriting the haplotree of womankind. The current haplotree has expanded substantially since the first handful of haplogroups thanks to thousands upon thousands of testers, but there is so much more information that can be extracted today.

Y and Mitochondrial Resources

If you don’t know of someone in your family to test for Y DNA or mitochondrial DNA for a specific ancestral line, you can always turn to the Y DNA projects at Family Tree DNA by searching here.

The search provides you with a list of projects available for a specific surname along with how many customers with that surname have tested. Looking at the individual Y DNA projects will show the earliest known ancestor of the surname line.

Another resource, WikiTree, lists people who have tested for the Y DNA, mitochondrial DNA and autosomal DNA lines of specific ancestors.


On the left side, my maternal great-grandmother’s profile card, and on the right, my paternal great-great-grandfather. You can see that someone has tested for the mitochondrial DNA of Nora (OK, so it’s me) and the Y DNA of John Estes (definitely not me.)

MitoYDNA, a nonprofit volunteer organization, created a comparison tool to replace Ysearch and Mitosearch when they bit the dust thanks to GDPR.

MitoYDNA accepts uploads from different sources and allows uploaders to not only match to each other, but to view the STR values for Y DNA and the mutation locations for the HVR1 and HVR2 regions of mitochondrial DNA. Mags Gaulden, one of the founders, explains in her article, What sets mitoYDNA apart from other DNA Databases?.

If you’ve tested at nonstandard companies, not realizing that they didn’t provide matching, or if you’ve tested at a company like Sorenson, Ancestry, or now Oxford Ancestors whose database is gone or going away, uploading your results to mitoYDNA is a way to preserve your investment. PS – I still recommend testing at FamilyTreeDNA in order to receive detailed results and compare in their large database.

CentiMorgans – The Word of Two Decades

The world of autosomal DNA turns on the centimorgan (cM) measure. What is a centimorgan, exactly? I wrote about that unit of measure in the article Concepts – CentiMorgans, SNPs and Pickin’ Crab.

Fortunately, new tools and techniques make using cMs much easier. The Shared cM Project was updated this year, and the results incorporated into a wonderfully easy tool used to determine potential relationships at DNAPainter based on the number of shared centiMorgans.

Match quality and potential relationships are determined by the number of shared cMs, and the chromosome browser is the best tool to use for those comparisons.

Chromosome Browser – Genetics Tool to View Chromosome Matches

Chromosome browsers allow testers to view their matching cMs of DNA with other testers positioned on their own chromosomes.

My two cousins’ DNA, where they match me on chromosomes 1-4, is shown above in blue and red at Family Tree DNA. It’s important to know where you match cousins, because if you match multiple cousins on the same segment, from the same side of your family (maternal or paternal), that’s suggestive of a common ancestor, with a few caveats.

Some people feel that a chromosome browser is an advanced tool, but I think it’s simply standard fare – kind of like driving a car. You need to learn how to drive initially, but after that, you don’t even think about it – you just get in and go. Here’s help learning how to drive that chromosome browser.

Triangulation – Science Plus Group DNA Matching Confirms Genealogy

The next logical step after learning to use a chromosome browser is triangulation. In fact, you’re seeing triangulation above, and you might not even realize it.

The purpose of genetic genealogy is to gather evidence to “prove” ancestral connections to either people or specific ancestors. In autosomal DNA, triangulation occurs when:

  • You match at least two other people (not close relatives)
  • On the same reasonably sized segment of DNA (generally 7 cM or greater)
  • And you can assign that segment to a common ancestor
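The first two criteria can be checked mechanically, so here is a minimal Python sketch of that check. The `Segment` structure, the function names, and the sample numbers are all hypothetical and are not any vendor’s format or algorithm. The sketch also requires that the two other people match each other on the same stretch of DNA, which is what makes the match a triangle. Assigning the segment to a common ancestor, the third criterion, is still genealogy work that no code can do for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One shared DNA segment between two testers (hypothetical structure)."""
    chromosome: int
    start: int   # start position in base pairs
    end: int     # end position in base pairs
    cm: float    # segment size in centiMorgans, as reported by the vendor


def common_overlap(segments):
    """Return the (start, end) range shared by all segments, or None."""
    if len({s.chromosome for s in segments}) != 1:
        return None  # segments on different chromosomes cannot overlap
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return (start, end) if start < end else None


def is_triangulated(me_a, me_b, a_b, min_cm=7.0):
    """True when all three pairwise matches are large enough and overlap.

    Simplification: the size test uses each reported segment's cM value,
    not the cM size of the overlapping region itself.
    """
    trio = [me_a, me_b, a_b]
    if any(s.cm < min_cm for s in trio):
        return False
    return common_overlap(trio) is not None


# Made-up example values, not real match data.
me_cousin1 = Segment(1, 10_000_000, 35_000_000, 22.5)
me_cousin2 = Segment(1, 15_000_000, 40_000_000, 24.1)
cousin1_cousin2 = Segment(1, 12_000_000, 38_000_000, 25.0)

print(is_triangulated(me_cousin1, me_cousin2, cousin1_cousin2))
```

The 7 cM default mirrors the "generally 7 cM or greater" guideline above and can be raised for a more conservative check.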

The same two cousins are shown above, with triangulated segments bracketed at MyHeritage. I’ve identified the common ancestor with those cousins that those matching DNA segments descend from.

MyHeritage’s triangulation tool confirms by bracketing that these cousins also match each other on the same segment, which is the definition of triangulation.

I’ve written a lot about triangulation recently.

If you’d prefer a video, I recorded a “Top Tips” Facebook LIVE with MyHeritage.

Why is Ancestry missing from this list of triangulation articles? Ancestry does not offer a chromosome browser or segment information. Therefore, you can’t triangulate at Ancestry. You can, however, transfer your Ancestry DNA raw data file to either FamilyTreeDNA, MyHeritage, or GEDmatch, all three of which offer triangulation.

Step by step download/upload transfer instructions are found in this article:

Clustering Matches and Correlating Trees

Based on what we’ve seen over the past few years, we can no longer depend on the major vendors to provide all of the tools that genealogists want and need.

Of course, I would encourage you to stay with mainstream products being used by a significant number of community power users. As with anything, there is always someone out there that’s less than honorable.

2020 saw a lot of innovation and new tools introduced. Maybe that’s one good thing resulting from people being cooped up at home.

Third-party tools are making a huge difference in the world of genetic genealogy. My favorites are Genetic Affairs (their AutoCluster tool is shown above), DNAPainter, and DNAGedcom.

These articles should get you started with clustering.

If you like video resources, here’s a MyHeritage Facebook LIVE that I recorded about how to use AutoClusters:

I created a compiled resource article for your convenience, here:

I have not tried a newer tool, YourDNAFamily, that focuses only on 23andMe results, although the creator has been a member of the genetic genealogy community for a long time.

Painting DNA Makes Chromosome Browsers and Triangulation Easy

DNAPainter takes the next step, providing a repository for all of your painted segments. In other words, DNAPainter is both a solution and a methodology for mass triangulation across all of your chromosomes.

Here’s a small group of people who match me on the same maternal segment of chromosome 1, including those two cousins in the chromosome browser and triangulation sections, above. We know that this segment descends from Philip Jacob Miller and his wife because we’ve been able to identify that couple as the most distant ancestor intersection in all of our trees.

It’s very helpful that DNAPainter has added the functionality of painting all of the maternal and paternal bucketed matches from Family Tree DNA.

All you need to do is to link your known matches to your tree in the proper place at FamilyTreeDNA, then they do the rest by using those DNA matches to indicate which of the rest of your matches are maternal and paternal. Instructions, here. You can then export the file and use it at DNAPainter to paint all of those matches on the correct maternal or paternal chromosomes.

Here’s an article providing all of the DNAPainter Instructions and Resources.

DNA Matches Plus Trees Enhance Genealogy

Of course, utilizing DNA matching plus finding common ancestors in trees is one of the primary purposes of genetic genealogy – right?

Vendors have linked the steps of matching DNA with matching ancestors in trees.

Genetic Affairs takes this a step further. If you don’t have an ancestor in your tree, but your matches have common ancestors with each other, Genetic Affairs assembles those trees to provide you with those hints. Of course, that common ancestor might not be relevant to your genealogy, but it just might be too!


This tree does not include me, but two of my matches descend from a common ancestor and that common ancestor between them might be a clue as to why I match both of them.
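To illustrate the general idea of finding ancestors that your matches share with each other, here is a minimal Python sketch. It is not how Genetic Affairs actually works; the function name, the match names, and the tree contents are all invented, and it assumes ancestors are recorded identically in every tree, which real trees rarely manage.

```python
from collections import defaultdict


def shared_ancestors(trees, min_matches=2):
    """Map each ancestor to the matches whose trees contain that ancestor,
    keeping only ancestors found in at least `min_matches` trees."""
    found_in = defaultdict(set)
    for match_name, ancestors in trees.items():
        for ancestor in ancestors:
            found_in[ancestor].add(match_name)
    return {a: sorted(m) for a, m in found_in.items() if len(m) >= min_matches}


# Hypothetical matches and tree contents, invented for illustration.
trees = {
    "Match A": {"John Doe b. 1790", "Mary Roe b. 1795", "Ann Smith b. 1820"},
    "Match B": {"John Doe b. 1790", "Mary Roe b. 1795", "Tom Jones b. 1815"},
    "Match C": {"Peter Brown b. 1801"},
}

for ancestor, matches in sorted(shared_ancestors(trees).items()):
    print(ancestor, "->", ", ".join(matches))
```

Any ancestor the sketch reports is only a hint. As with the tree above, it still takes research to find out whether that couple belongs in your own tree.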

Ethnicity Continues to be Popular – But Is No Shortcut to Genealogy

Ethnicity is always popular. People want to “do their DNA” and find out where they come from. I understand. I really do. Who doesn’t just want an answer?

Of course, it’s not that simple, but that doesn’t mean it’s not disappointing to people who test for that purpose with high expectations. Hopefully, ethnicity will pique their curiosity and encourage engagement.

All four major vendors rolled out updated ethnicity results or related tools in 2020.

The future for ethnicity, I believe, will be held in integrated tools that allow us to use ethnicity results for genealogy, including being able to paint our ethnicity on our chromosomes as well as perform segment matching by ethnicity.

For example, if I carry an African segment on chromosome 1 from my father, and I match one person from my mother’s side and one from my father’s side on that same segment – one or the other of those people should also have that segment identified as African. That information would inform me as to which match is paternal and which is maternal.

Not only that, this feature would help immensely tracking ancestors back in time and identifying their origins.

Will we ever get there? I don’t know. I’m not sure ethnicity is or can be accurate enough. We’ll see.

Transition to Digital and Online

Sometimes the future drags us kicking and screaming from the present.

With the imposed isolation of 2020, conferences quickly moved to an online presence. The genealogy community has all pulled together to make this work. The joke is that 2020’s most used phrase is “can you hear me?” I can vouch for that.

Of course, while the year 2020 is over, the problem isn’t, and it is extending at least through the first half of 2021 and possibly longer. Conferences are planned months, up to a year, in advance and they can’t turn on a dime, so don’t expect in-person conferences until either late in 2021 or, more likely, 2022 if all goes well this year.

I expect the future will eventually return to in-person conferences, but not entirely.

Finding ways to be more inclusive allows people who don’t want to or can’t travel or join in-person to participate.

I’ve recorded several sessions this year, mostly for 2021. Trust me, these could be a comedy, mostly of errors 😊

I participated in four MyHeritage Facebook LIVE sessions in 2020 along with some other amazing speakers. This is what “live” events look like today!

Screenshot courtesy MyHeritage

A few days ago, I asked MyHeritage for a list of their LIVE sessions in 2020 and was shocked to learn that there were more than 90 in English, all free, and you can watch them anytime. Here’s the MyHeritage list.

By the way, every single one of the speakers is a volunteer, so say a big thank you to the speakers who make this possible, and to MyHeritage for the resources to make this free for everyone. If you’ve ever tried to coordinate anything like this, it’s anything but easy.

Additionally, I’ve created two webinars this year for Legacy Family Tree Webinars.

Geoff Rasmussen put together the list of their top webinars for 2020, and I was pleased to see that I made the top 10! I’m sure there are MANY MORE you’d be interested in watching. Personally, I’m going to watch #6 yet today! Also, #9 and #22. You can always watch new webinars for free for a few days, and you can subscribe to watch all webinars, here.

The 2021 list of webinar speakers has been announced here, and while I’m not allowed to talk about something really fun that’s upcoming, let’s just say you definitely have something to look forward to in the springtime!

Also, don’t forget to register for RootsTech Connect which is entirely online and completely free, February 25-27, here.

Thank you to Penny Walters for creating this lovely graphic.

There are literally hundreds of speakers providing sessions in many languages for viewers around the world. I’ve heard the stats, but we can’t share them yet. Let me just say that you will be SHOCKED at the magnitude and reach of this conference. I’m talking dumbstruck!

During one of our Zoom calls, one of the organizers said it feels like we’re constructing the plane as we’re flying, and I can confirm his observation – but we are getting it done – together! All hands on deck.

I’ll be presenting an advanced session about triangulation as well as a mini-session in the FamilySearch DNA Resource Center about finding your mother’s ancestors. I’ll share more information as it’s released and I can.

Companies and Owners Come & Go

You probably didn’t even notice some of these 2020 changes. Aside from the death of Bryan Sykes (RIP, Bryan), the big news and the even bigger unknown is the acquisition of Ancestry by Blackstone. Recently the CEO, Margo Georgiadis, announced that she was stepping down. The Ancestry Board of Directors has announced an external search for a new CEO. All I can say is that very high on the priority list should be someone who IS a genealogist and who understands how DNA applies to genealogy.

Other changes included:

In the future, as genealogy and DNA testing become ever more popular and even more of a commodity, company sales and acquisitions will become more commonplace.

Some Companies Reduced Services and Cut Staff

I understand this too, but it’s painful. The layoffs occurred before Covid, so they didn’t result from Covid-related sales reductions. Let’s hope we see renewed investment after the Covid mess is over.

In a move that may or may not be related to an attempt to cut costs, Ancestry removed 6 and 7 cM matches from their users’ match lists, reducing the processing resources, hardware, and storage required and thereby reducing costs.

I’m not going to beat this dead horse, because Ancestry is clearly not going to move on this issue, nor on that of the much-requested chromosome browser.

Later in the year, 23andMe also removed matches and other features, although, to their credit, they have restored at least part of this functionality and have provided ethnicity updates to V3 and V4 kits which wasn’t initially planned.

It’s also worth noting that early in 2020, 23andMe laid off 100 people as sales declined. Since that time, 23andMe has increasingly pushed consumers to pay to retest on their V5 chip.

About the same time, Ancestry also cut their workforce by about 6%, or about 100 people, also citing a slowdown in the consumer testing market. Ancestry also added a health product.

I’m not sure if we’ve reached market saturation or are simply seeing a leveling off. I wrote about that in DNA Testing Sales Decline: Reason and Reasons.

Of course, the pandemic economy where many people are either unemployed or insecure about their future isn’t helping.

The various companies need some product diversity to survive downturns. 23andMe is focused on medical research with partners who pay 23andMe for the DNA data of customers who opt in, as is Ancestry.

Both Ancestry and MyHeritage provide subscription services for genealogy records.

FamilyTreeDNA is part of a larger company, GenebyGene, whose genetics labs do processing for other companies and medical facilities.

A huge thank you to both MyHeritage and FamilyTreeDNA for NOT reducing services to customers in 2020.

Scientific Research Still Critical & Pushes Frontiers

Now that DNA testing has become a commodity, it’s easy to lose track of the fact that DNA testing is still a scientific endeavor that requires research to continue to move forward.

I’m still passionate about research after 20 years – maybe even more so now because there’s so much promise.

Research bleeds over into the consumer marketplace where products are improved and new features created allowing us to better track and understand our ancestors through their DNA that we and our family members inherit.

Here are a few of the research articles I published in 2020. You might notice a theme here – ancient DNA. What we can learn now due to new processing techniques is absolutely amazing. Labs can share files and information, providing the ability to “reprocess” the data, not the DNA itself, as more information and expertise become available.

Of course, in addition to this research, the Million Mito Project team is hard at work rewriting the tree of womankind.

If you’d like to participate, all you need to do is to either purchase a full sequence mitochondrial DNA kit at FamilyTreeDNA, or upgrade to the full sequence if you tested at a lower level previously.

Predictions

Predictions are risky business, but let me give it a shot.

Looking back a year, Covid wasn’t on the radar.

Looking back 5 years, neither Genetic Affairs nor DNAPainter was on the scene yet. DNAAdoption had just been formed in 2014, and DNAGedcom, which was born out of DNAAdoption, didn’t yet exist.

In other words, the most popular tools today didn’t exist yet.

GEDmatch, founded in 2010 by genealogists for genealogists, was 5 years old at that point and was sold to Verogen in December 2019.

We were begging Ancestry for a chromosome browser, and while we’ve pretty much given up beating them, because the horse is dead and they can sell DNA kits through ads focused elsewhere, that doesn’t mean genealogists don’t still need and want chromosome and segment-based tools. Why, you’d think that Ancestry really doesn’t want us to break through those brick walls. That would be very bizarre, because every brick wall that falls reveals two more ancestors that need to be researched and spurs a frantic flurry of midnight searching. If you’re laughing right now, you know exactly what I mean!

Of course, if Ancestry provided a chromosome browser, it would cost development money for no additional revenue and their customer service reps would have to be able to support it. So from Ancestry’s perspective, there’s no good reason to provide us with that tool when they can sell kits without it. (Sigh.)

I’m not surprised by the management shift at Ancestry, and I wouldn’t be surprised to see several big players go public in the next decade, if not the next five years.

As companies increase in value, the number of private individuals who could afford to purchase the company decreases quickly, leaving private corporations as the only potential buyers, unless the company becomes publicly held. Sometimes, that’s a good thing because investment dollars are infused into new product development.

What we desperately need, and I predict will happen one way or another, is a marriage of individual tools and functions that exist separately today, with a dash of innovation. We need tools that will move beyond confirming existing ancestors – and will be able to identify ancestors through our DNA – out beyond each and every brick wall.

If a tester’s DNA matches multiple people in a group descended from a particular previously unknown couple, and the timing and geography fit as well, that provides genealogical researchers with the hint they need to begin excavating the traditional records, looking for a connection.

In fact, this is exactly what happened with mitochondrial DNA – twice now. A match and a great deal of digging by one extremely persistent cousin resulted in identifying potential parents for a brick-wall ancestor. Autosomal DNA then confirmed that my DNA matched with 59 other individuals who descend from that couple through multiple children.

BUT, we couldn’t confirm those ancestors using autosomal DNA UNTIL WE HAD THE NAMES of the couple. DNA has the potential to reveal those names!

I wrote about that in Mitochondrial DNA Bulldozes Brick Wall and will be discussing it further in my RootsTech presentation.
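The idea of spotting a previously unknown couple that keeps turning up in the trees of several matches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `shared_unknown_couples`, the "Husband & Wife" labels and the threshold of three matches are all hypothetical, and no vendor is known to work this way.

```python
from collections import defaultdict


def shared_unknown_couples(match_trees, my_couples, min_matches=3):
    """Return couples found in at least `min_matches` match trees but not in mine.

    match_trees: dict mapping a match name to the set of ancestral couples
                 in that match's tree (hypothetical data structure).
    my_couples:  set of couples already in my own tree.
    """
    seen_in = defaultdict(set)
    for match_name, couples in match_trees.items():
        for couple in couples:
            if couple not in my_couples:
                seen_in[couple].add(match_name)

    hits = {
        couple: sorted(names)
        for couple, names in seen_in.items()
        if len(names) >= min_matches
    }
    # Most widely shared couples first, then alphabetical for stable output.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


# Invented example data, not real matches.
match_trees = {
    "Match A": {"John Doe & Jane Roe", "Sam Poe & Ann Low"},
    "Match B": {"John Doe & Jane Roe"},
    "Match C": {"John Doe & Jane Roe", "Known Couple"},
    "Match D": {"Sam Poe & Ann Low"},
}
my_couples = {"Known Couple"}

for couple, names in shared_unknown_couples(match_trees, my_couples):
    print(couple, "appears in the trees of", ", ".join(names))
```

A couple that surfaces this way is only a hint. The timing, the geography and the traditional records still have to agree before it means anything.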

The Challenge

We have most of the individual technology pieces today to get this done. Of course, the combined technological solution would require significant computing resources and processing power – just at the same time that vendors are desperately trying to pare costs to a minimum.

Some vendors simply aren’t interested, as I’ve already noted.

However, the winner, other than us genealogists, of course, will be the vendor who can either devise solutions or partner with others to create the right mix of tools that will combine matching, triangulation, and trees of your matches to each other, even if you don’t share a common ancestor.

We need to follow the DNA past the current end of the branch of our tree.

Each triangulated segment has an individual history that will lead not just to known ancestors, but to their unknown ancestors as well. We have reached critical mass in terms of how many people have tested – and more success would encourage more and more people to test.

There is a genetic path over every single brick wall in our genealogy.

Yes, I know that’s a bold statement. It’s not future Jetson’s flying-cars stuff. It’s doable – but it’s a matter of commitment, investment money, and finding a way to recoup that investment.

I don’t think it’s possible for the one-time purchase of a $39-$99 DNA test, especially when it’s not a loss-leader for something else like a records or data subscription (MyHeritage and Ancestry) or a medical research partnership (Ancestry and 23andMe.)

We’re performing these analysis processes manually and piecemeal today. It’s extremely inefficient and labor-intensive – which is why it often fails. People give up. And the process is painful, even when it does succeed.

This process has also been made increasingly difficult because some vendors block the tools that help genealogists by downloading match and ancestral tree information. Before Ancestry closed access, I was creating theories based on common ancestors in my matches’ trees that weren’t in mine – then testing those theories both genetically (clusters, AutoTrees and ThruLines) and also by digging into traditional records to search for the genetic connection.

For example, I’m desperate to identify the parents of my James Lee Clarkson/Claxton, so I sorted my spreadsheet by surname and began evaluating everyone who had a Clarkson/Claxton in their tree in the 1700s in Virginia or North Carolina. But I can’t do that anymore now, either with a third-party tool or directly at Ancestry. Twenty million DNA kits sold for a minimum of $79 equals more than 1.5 billion dollars. Obviously, the issue here is not a lack of funds.
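For readers who still have a downloaded spreadsheet of ancestors from their matches’ trees, the kind of surname, date and place filter described above might look something like this in Python. It is a minimal sketch: the `TreeEntry` fields and the sample rows are hypothetical, and a real spreadsheet will have its own column layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeEntry:
    """One ancestor listed in a match's tree (hypothetical layout)."""
    match_name: str
    surname: str
    birth_year: Optional[int]
    place: str


SURNAMES = {"clarkson", "claxton"}
PLACES = ("virginia", "north carolina")


def is_candidate(entry, surnames=SURNAMES, places=PLACES,
                 first_year=1700, last_year=1799):
    """True if the entry has a target surname, century and location."""
    if entry.surname.strip().lower() not in surnames:
        return False
    if entry.birth_year is None:
        return False
    if not first_year <= entry.birth_year <= last_year:
        return False
    place = entry.place.lower()
    return any(p in place for p in places)


def candidates(entries):
    """Filter the entries and sort them by surname, then birth year."""
    keep = [e for e in entries if is_candidate(e)]
    return sorted(keep, key=lambda e: (e.surname.lower(), e.birth_year))


# Invented rows for illustration only.
entries = [
    TreeEntry("Match A", "Claxton", 1775, "Russell County, Virginia"),
    TreeEntry("Match B", "Clarkson", 1820, "Hancock County, Tennessee"),
    TreeEntry("Match C", "Clarkson", 1742, "Bertie County, North Carolina"),
    TreeEntry("Match D", "Smith", 1750, "Virginia"),
    TreeEntry("Match E", "Claxton", None, "Virginia"),
]

for e in candidates(entries):
    print(e.match_name, e.surname, e.birth_year, e.place)
```

Entries with no birth year are skipped here to keep the list short. Whether to keep them for manual review is a judgment call.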

Including Y and mitochondrial DNA resources in our genetic toolbox not only confirms accuracy but also provides additional hints and clues.

Sometimes we start with Y DNA or mitochondrial DNA, and wind up using autosomal and sometimes the reverse. These are not competing products. It’s not either/or – it’s *and*.

Personally, I don’t expect the vendors to provide this game-changing complex functionality for free. I would be glad to pay for a subscription for top-of-the-line innovation and tools. In what other industry do consumers expect to pay for an item once and receive constant life-long innovations and upgrades? That doesn’t happen with software, phones, or automobiles. I want vendors to be profitable so that they can invest in new tools that leverage the power of computing for genealogists to solve currently unsolvable problems.

Every single end-of-line ancestor in your tree represents a brick wall you need to overcome.

If you compare the cost of books, library visits, courthouse trips, and other research endeavors that often produce exactly nothing, these types of genetic tools would be both a godsend and an incredible value.

That’s it.

That’s the challenge, a gauntlet of sorts.

Who’s going to pick it up?

I can’t answer that question, but I can say that 23andMe can’t do this without supporting extensive trees, and Ancestry has shown absolutely no inclination to support segment data. You can’t achieve this goal without segment information or without trees.

Among the current players, that leaves two DNA testing companies and a few top-notch third parties as candidates, although, as the past has proven, the future is uncertain, fluid, and ever-changing.

It will be interesting to see what I’m writing at the end of 2025, or maybe even at the end of 2021.

Stay tuned.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

Books

Fun Genealogy Activities for Trying Times

My mother used to say that patience is a virtue.

patience stones.jpg

I’m afraid I’m not naturally a very virtuous person, at least not where patience is concerned. I don’t seem to take after my ancestor, Patience Brewster (1600-1634.) Perhaps those “patience” genes didn’t make it to my generation. Or maybe Patience wasn’t very patient herself.

Not only does patience not come naturally to me, it’s more difficult for everyone during stressful times. People are anxious, nerves are frazzled and tempers are short. Have you noticed that recently?

I guess you could say that what we’ve been enduring, in terms of both health issues and/or preparation for the Covid-19 virus along with the economic rollercoaster – not to mention the associated politics, is stress-inducing.

patience stress.png

Let’s see:

  • Worry about a slow-motion epidemic steamrollering the population as it wraps around the world – check.
  • Worry about family members – check.
  • Worry about TP, hand sanitizer, food, medication and other supplies – check.
  • Worry about jobs and income – check.
  • Worry about retirement accounts and medical bills – check.
  • Worry about long-term ramifications – check.

Nope, no stress here. What about you?

And yes, I’m intentionally understated, hoping to at least garner a smile.

Once you’ve stocked up on what you need and decided to stay home out of harm’s way – or more to the point, out of germ’s way – how can you feel more patient and less stressed?

I have some suggestions!

patience stress relief.png

The Feel Better Recipe

First, just accept that once you’ve done what you can do to help yourself, which includes minimizing exposure – there’s little else that you can do. I wrote about symptoms and precautions, here. The best thing you can do is wash, stay home and remain vigilant.

If someone you know or love doesn’t understand why we need to limit or eliminate social interaction at this point, here’s an article that explains how NOT to be stupid, as well as an article here about what flattening the curve means and why social distancing is our only prayer at this point to potentially avoid disaster. We are all in this together and we all have a powerful role to play – just by staying at home.

Educating and encouraging others to take precautionary steps might help, but worrying isn’t going to help anything because you can’t affect much beyond your own sphere of influence. As much as we wish we could affect the virus itself, or increase the testing supply, or influence good decision-making by others, we generally can’t.

What can we do, aside from sharing precautionary information and hoping that we are “heard?”

We can try to release the worry.

patience zen.png

If you sit there thinking about releasing the worry, which means you’re focused on worrying – that’s probably not going to be very productive.

Neither is drinking your entire supply of Jack Daniels in one sitting – not least because you may need that as hand sanitizer down the road a bit. Oh, wait, hand sanitizer is supposed to be more than 60% alcohol, which would be 120 proof. Never mind, go ahead and drink the Jack Daniels 😊

What you really need is a distraction. Preferably a beneficial distraction that won’t give you a hangover. Not like my distraction this past month when the washing machine flooded through the floor into the basement including my office below. No, not that kind of distraction.

Some folks can “escape the world,” in a sense, by watching TV, but I’m not one of those people. I need to engage my mind with some sort of structure and I want to feel like I’m accomplishing something. If you’re a “TV” person, you’re probably watching TV now and not reading this anyway – so I’m guessing that’s not my readership audience, by and large.

Beneficial Distractions

Here are 20 wonderful ideas for fun and useful things to do – and guess what – they aren’t all genealogy related. Let’s start with something that will make you feel wonderful.

labyrinth

  1. Take a walk – outside, but not around other people. Your body and mind will thank you. Your body likes to move, and exercise generates beneficial feel-good endorphins, reducing anxiety. Remember to take hand sanitizer with you and open doors by pushing with your arm or hip, if possible. Also, if you need to get fuel for your vehicle, take disposable gloves to handle the pump. Disinfectant, soap and water are your friends – maybe your best friends right now.

patience books.png

  2. Read a book. Escapism, pure and simple. I have a stack of books just waiting. If you don’t, you can download e-books to your Kindle or iPad or phone directly from Amazon without going anyplace or have books delivered directly to your door. Try Libby Copeland’s The Lost Family, which you can order here. It’s dynamite. (The story of my brother and me is featured, which I wrote about here.) If you’d like DNA education, you can order Diahan Southard’s brand new book, Your DNA Guide: Step by Step Plans, here. I haven’t read Diahan’s book, but I’m familiar with the quality of her work and don’t have any hesitation about recommending it. (Let me know what you think.) And hey, you don’t even need hand sanitizer for this!

patience check box.png

  3. Check your DNA matches at all the vendors where you’ve tested. If you don’t check daily, now would be a good time to catch up. Not just autosomal matches, but also Y and mitochondrial at Family Tree DNA. Those tests often get overlooked. Maybe some of your matches have updated their trees or earliest known ancestor information.

patience tree.png

  4. Speaking of trees, update your trees on the three DNA/genealogy sites that support trees: FamilyTreeDNA, MyHeritage and Ancestry. Keeping your tree up to date through at least the 8th generation (including their children) enables the companies to more easily connect the dots for their helpful tools like Phased Family Matching aka bucketing at FamilyTreeDNA, Theories of Family Relativity aka TOFR at MyHeritage and ThruLines at Ancestry.

patience connect.png

  5. Connect your known matches to their appropriate place on your tree at Family Tree DNA, as illustrated above. This provides fuel for Family Tree DNA to be able to designate your matches as maternal or paternal, even if your mother and father haven’t tested. In this case, I’ve connected my first cousin once removed, who matches me, in her proper location in my tree. People who match both my cousin and me are assigned to my maternal bucket.

patience y dna.png

patience mtdna.png

  6. Order or upgrade a Y DNA or mitochondrial DNA test or a Family Finder autosomal test for you or a family member at Family Tree DNA. Upgrades, shown above, are easy if the tester has already taken at least one test, because DNA is banked at the lab for future orders. You don’t have to go anyplace to do this and DNA testing results and benefits last forever. Your DNA works for you 24x7x365.

patience join project.png

patience projects.png

  7. Join a free project at FamilyTreeDNA. Those can be surname projects, haplogroup projects, regional projects such as Acadian AmerIndian, and other interest topics like American Indian. You can search or browse for projects of interest and collaborate with others. Projects are managed by volunteer administrators who obviously have an interest in the project’s topic.

patience match.png

  8. At each of the vendors, find your highest autosomal match whom you cannot place as a relative. Work on their line via tree construction and then by clustering with Genetic Affairs. I wrote about Genetic Affairs, an amazing tool, here, which you can try for free.

patience familysearch wiki.png

patience claiborne.png

  9. Check the FamilySearch WIKI for your genealogy locations by googling “Claiborne County, Tennessee FamilySearch wiki”, substituting the location you are researching for “Claiborne County, Tennessee.” FamilySearch is free and the WIKI includes resources outside of FamilySearch itself, including paid and other free sites.

patience familysearch records.png

  10. While you’re at it, if you haven’t already, create a FamilySearch account and create or upload a tree to FamilySearch. It will be connected to branches of existing trees to create one large worldwide tree. Yes, you’ll be frustrated in some cases because there are incorrect ancestors sometimes listed in the “big tree” – BUT – there are procedures in place to remediate that situation. The important aspect is that FamilySearch, which is free, provides hints and resources not available any other place for some ancestors. Not long ago, I found a detailed estate packet that I had no idea existed – for a female ancestor no less. You can search at FamilySearch for ancestors, genealogies, records and in other ways. New records become available often. This will keep you occupied for days, I promise!

Patience Journal.png

  11. Begin a Novel Coronavirus Covid-19 Pandemic journal. Think of your descendants 100 years in the future. Wouldn’t you like to know what your great-grandparents were doing during the 1918 Spanish Flu Pandemic? Or even their siblings or neighbors, because that was likely similar to what your ancestors were doing as well. You don’t have to write much daily – just write. Not just facts, but how you feel as well. Are you afraid, concerned specifically about someone? What’s going on with you – in your mind? That’s the part of you that your descendants will long to know a century from now.

Quilt rose

  12. Create something with your hands. I made a quilt this week for an ailing friend, unrelated to this epidemic. No, I didn’t “have time” to do that, but I made time because this quilt is important, and I know they need the “get well” wishes and love that quilt will wrap them in. It always feels good to do something for someone else.

patience gardening.jpg

  13. Garden, or in my case, that equates to pulling weeds. Not only is weeding productive, you can work off frustration by thinking about someone or something that upsets you as you yank those weeds out by their roots. Of course, that means you’ll have to first decide what is, and is not, a weed 😊. That could be the toughest part.

patience smart matches.png

  14. At MyHeritage, you can use Irish records for free this month, plus try a free subscription, here, in order to access all the rest of the millions of records available at MyHeritage. Check for Smart Matches for ancestors, shown above, and confirm that they are accurate, meaning that the ancestor the other person has in their tree is the same person as you have in your tree – even if they aren’t exactly identical. You don’t need to import any of their information, and I would suggest that you don’t without reviewing every piece of information individually. Confirming Smart Matches helps MyHeritage build Theories of Family Relativity – not to mention you may discover additional information about your ancestors. While you’re checking Smart Matches, who ARE those other people with your grandmother in their tree? Are they relatives who might have information that you don’t? This is a good opportunity to reach out. And what are those 12 pending record matches? Inquiring minds want to know. Let’s check.
patience newspapers


  15. Check either Newspapers.com or the Newspaper collection at MyHeritage, or both, systematically, for each ancestor. You never know what juicy tidbits you might discover about your ancestors. Often, things “forgotten” by families are the informative morsels you’ll want to know and are hidden in those local news articles. These newsy community newspapers bring the life and times of our ancestors to light in ways nothing else can. Wait, what? My Brethren ancestor, Hiram Ferverda, pleaded guilty to something??? I’d better read this article!

patience interview.png

  16. Interview your relatives. Make a list of questions you’d like them to answer about themselves and the most distant common ancestors that they knew, or knew about. You can conduct interviews without being physically together via the phone or Skype or FaceTime. Document what was said for the future, in writing, and possibly by recording as well. After someone has passed, hearing their voice again is priceless.

Upload download

  17. Transfer your DNA file to vendors that accept transfers, getting more bang for your testing dollars by finding more matches. 23andMe and Ancestry don’t accept transfers. At MyHeritage and FamilyTreeDNA, transfers are free and so is matching, but advanced tools require a small unlock fee. I wrote a step-by-step series about how to transfer, here. Each article includes instructions for transferring from or to Ancestry, MyHeritage, 23andMe and FamilyTreeDNA. Don’t forget to upload to GEDmatch for additional tools.

patience brick wall.jpg

  18. Focus on your most irritating brick wall and review what records you do and don’t have that could be relevant. That would include local, county, state and federal records, tax lists, census, church records and minutes and local histories if they exist. Have you called the local library and asked about vertical files or other researchers? What about state archive resources? Don’t forget activities like google searches. Have you utilized all possible DNA clues, including Y DNA and mitochondrial DNA, if applicable? How about third-party tools like Genetic Affairs and DNAGedcom?

patience DNApainter.png

  19. Try DNAPainter, for free. Painting your chromosomes and walking those segments back in time to the ancestors from whom they descended is so much fun. Not to mention you can integrate ethnicity and now traits, too. I’ve written instructions for using DNAPainter in a variety of ways, here.

patience webinars.png

  20. Expand your education by watching webinars at Legacy Family Tree Webinars. Many are free and a yearly subscription is very reasonable. Take a look, here.

patience bucket.png

  21. Spring clean your house or desk. Ewww – cleaning – the activity that is never done and begins undoing itself immediately after you’ve finished? Makes any of the above 20 activities sound wonderful by comparison, right? I agree, so pick one and let’s get started!

Let me know what you find. Write about your search activities and discoveries in your Pandemic journal too.


DNAPainter: Painting “Bucketed” Family Tree DNA Maternal and Paternal Family Finder Matches in One Fell Swoop

DNAPainter has done it again, providing genealogists with a wonderful tool that facilitates separating your matches into maternal and paternal categories so that they can be painted on the proper chromosome – in one fell swoop no less.

Of course, the entire purpose of painting your chromosomes is to identify segments that descend from specific ancestors in order to push those lines back further in time genealogically. Identifying segments, confirming and breaking down brick walls is the name of the game.

DNA Painter New Import Tool

The new DNAPainter tool relies on Family Tree DNA’s Phased Family Matching which assigns your matches to maternal and paternal buckets. On your match list, at the top, you’ll see the following which indicates how many matches you have in total and how many people are assigned to each bucket.

DNAPainter FF import.png

Note that these are individual matches, not total matching segments – that number would be higher.

In order for Family Tree DNA to create bucketed matches for you, you’ll need to:

  • Either create a tree or upload a GEDCOM file
  • Attach your DNA kit to “you” in your tree
  • Attach all 4th cousins and closer with whom you match to their proper location on your tree

Yes, it appears that Family Tree DNA is now using 4th cousins, not just third cousins and closer, which provides for additional bucketed matches.

How reliable is bucketing?

Quite. Occasionally one of two issues arises, which becomes evident if you actually compare the matches’ segments to the parent with whom they are bucketed:

  • One or more of your matches’ segments do match you and your parent, but additionally, one or more segments match you, but not your parent. The X chromosome is particularly susceptible to this issue, especially with lower cM matches.
  • Occasionally, a match that is large enough to be bucketed isn’t, likely because no known, linked cousin shares that segment

Getting Started

Get started by creating or uploading your tree at Family Tree DNA.

DNAPainter mytree.png

After uploading your GEDCOM file or creating your tree at Family Tree DNA, click on the “matches” icon at the top of the tree to link yourself and your relatives to their proper places on your tree. Your matches will show in the box below the helix icon.

DNAPainter FF matches.png

I created an example “twin” for myself to use for teaching purposes by uploading a file from Ancestry, so I’m going to attach that person to my tree as my “Evil Twin.” (Under normal circumstances, I do not recommend uploading duplicate files of anyone.)

DNAPainter FF matches link.png

Just drag and drop the person on your match list on top of their place on the tree.

DNAPainter Ff sister.png

Here I am as my sister, Example Adoptee.

I’ve wished for a very, very long time that there was a way to obtain a list of segment matches sorted by maternal and paternal bucket without having to perform spreadsheet gymnastics, and now there is, at DNAPainter.

DNAPainter does the heavy-lifting so you don’t have to.

What Does DNAPainter Do with Bucketed Matches?

When you are finished uploading two files at DNAPainter, you’ll have:

  • Maternal groups of triangulated matches
  • Paternal groups of triangulated matches
  • Matches that could not be assigned based on the bucketing. Some (but not all) of these matches will be identical by chance – typically roughly 15-20% of your match list. You can read about identical by chance, here.

I’ll walk you through the painting process step by step.

First, you need to be sure your relatives are connected to your tree at Family Tree DNA so that you have matches assigned to your maternal and paternal buckets. The more relatives you connect, per the instructions in the previous section, the more matching people will be able to be placed into maternal or paternal buckets.

Painting Bucketed Matches at DNAPainter

I wrote basic articles about how to use DNAPainter here. If you’re unfamiliar with how to use DNAPainter or it’s new to you, now would be a good time to read those articles. This next section assumes that you’re using DNAPainter. If not, go ahead, register, and set up a profile. One profile is free for everyone, but multiple profiles require a subscription.

First, make a duplicate of the profile that you’re working with. This DNAPainter upload tool is in beta.

DNAPainter duplicate profile.png

Since I’m teaching and experimenting, I am using a fresh, new profile for this experiment. If it works successfully, I’ll duplicate my working profile, just in case something goes wrong or doesn’t generate the results I expect, and repeat these steps there.

Second, at Family Tree DNA, download a fresh copy of your complete matching segment file. This “Download Segments” link is found at the top right of the chromosome browser page.

DNAPainter ff download segments.png

Third, download your matches at the bottom left of the actual matches page. This file holds information about your matches, such as which ones are bucketed, but no segment information. That’s in the other file.

DNAPainter csv.png

Name both of these files something you can easily identify and that tells them apart. I added “Segments” to the front of the first file name and “Matches” to the front of the second.

Fourth, at DNAPainter, you’ll need to import the entire segment file that you just downloaded from Family Tree DNA. I exclude segments under 7 cM because they are about 50% identical by chance.
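If you’d rather trim the small segments from a copy of the file yourself before doing anything else with it, a filter along these lines would work. This is a sketch only: the column names (“Match Name”, “Centimorgans” and so on) are assumptions about the download layout, so check the header row of your own file, and the sample rows are invented.

```python
import csv
import io


def keep_segments(rows, min_cm=7.0, cm_column="Centimorgans"):
    """Keep only the segments at or above the centimorgan threshold."""
    return [row for row in rows if float(row[cm_column]) >= min_cm]


# Invented sample standing in for a downloaded segment file.
sample = """Match Name,Chromosome,Start Location,End Location,Centimorgans
Person 1,1,1000000,9000000,10.2
Person 2,3,500000,2500000,3.4
Person 3,X,7000000,15000000,7.0
"""

rows = list(csv.DictReader(io.StringIO(sample)))
kept = keep_segments(rows)
print([row["Match Name"] for row in kept])
```

To use it on a real file, replace `io.StringIO(sample)` with an opened file, for example `open("Segments.csv", newline="")`, where the file name is whatever you called your download.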

DNAPainter import instructions


Select the segment file you just named and click on import.

DNAPainter both.png

At this point, your chromosomes at DNAPainter will look like this, assuming you’re using a new profile with nothing else painted.

Let’s expand chromosome 1 and see what it looks like.

DNAPainter chr 1 both.png

Note that all segments are painted over both chromosomes, meaning both the maternal and paternal copies of chromosome 1, partially shown above, because at this point, DNAPainter can’t tell which people match on the maternal and which people match on the paternal sides. The second “matches” file from Family Tree DNA has not yet been imported into DNAPainter, which tells DNAPainter which matches are on the maternal and which are on the paternal chromosomes.

If you’re not working with a new profile, then you’ll also see the segments you’ve already painted. DNAPainter attempts to NOT paint segments that appear to have previously been painted.

Fifth, at DNAPainter, click on the “Import mat/pat info from ftDNA” link on the left which will provide you with a page to import the matches file information. This is the file that has maternal and paternal sides specified for bucketed matches. DNAPainter needs both the segment file, which you already imported, and the matches file.

DNAPainter import bucket


After the second import, the “matches” file, my matches are magically redistributed onto their appropriate chromosomes based on the maternal and paternal bucketing information.
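Conceptually, the second import is a lookup: each segment’s match name is found in the matches file, and that match’s bucket decides which chromosome copy the segment lands on. The Python sketch below illustrates the idea only. It is not DNAPainter’s code, and the column names “Match Name”, “Full Name” and “Matching Bucket” are assumptions about the two downloads.

```python
def assign_sides(segments, matches):
    """Group segment rows by the bucket of the person they belong to.

    segments: list of dicts with a "Match Name" key (assumed column name).
    matches:  list of dicts with "Full Name" and "Matching Bucket" keys
              (assumed column names).
    Anyone who is not bucketed to exactly one parent goes to "Shared".
    """
    bucket_by_name = {m["Full Name"]: m["Matching Bucket"] for m in matches}
    groups = {"Maternal": [], "Paternal": [], "Shared": []}
    for seg in segments:
        bucket = bucket_by_name.get(seg["Match Name"], "N/A")
        side = bucket if bucket in ("Maternal", "Paternal") else "Shared"
        groups[side].append(seg)
    return groups


# Invented rows for illustration.
matches = [
    {"Full Name": "Person 1", "Matching Bucket": "Maternal"},
    {"Full Name": "Person 2", "Matching Bucket": "Paternal"},
    {"Full Name": "Person 3", "Matching Bucket": "Both"},
    {"Full Name": "Person 4", "Matching Bucket": "N/A"},
]
segments = [
    {"Match Name": "Person 1", "Chromosome": "1", "Centimorgans": "10.2"},
    {"Match Name": "Person 1", "Chromosome": "2", "Centimorgans": "8.1"},
    {"Match Name": "Person 2", "Chromosome": "1", "Centimorgans": "12.0"},
    {"Match Name": "Person 3", "Chromosome": "5", "Centimorgans": "9.5"},
    {"Match Name": "Person 4", "Chromosome": "7", "Centimorgans": "7.3"},
]

groups = assign_sides(segments, matches)
for side, segs in groups.items():
    print(side, len(segs))
```

Notice that every segment for a bucketed person follows that person’s bucket. That is why an occasional segment can end up on the wrong parent’s chromosome, as discussed in the manual comparison later on.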

I love this tool!

At this point, you will have three groups of matches, assuming you have people assigned to your maternal and paternal buckets.

  • A “Shared” group for people who are related to both of your parents, or who aren’t designated as a bucketed match to either parent
  • Maternal group (pink chromosome)
  • Paternal group (blue chromosome)

It’s Soup!!!

I’m so excited. Now my matches are divided into maternal and paternal chromosome groups.

DNAPainter import complete.png

Just so you know, I changed the colors of my legend at DNAPainter using “edit group,” because all three groups were shades of pink after the import and I wanted to be able to see the difference clearly.

DNAPainter legend.png

Your Painted Chromosomes

Let’s take a look at what we have.

DNAPainter both, mat, pat.png

There’s still pink showing, meaning undetermined, which gets painted over both the maternal and paternal chromosomes, but there’s also a lot of magenta (maternal) and blue (paternal) showing now too as a result of bucketing.

Let’s look at chromosome 1.

DNAPainter chr 1 all.png

This detail, which is actually a summary, shows that the bucketed maternal (magenta) and paternal (blue) matches have actually covered most of the chromosome. There are still a few areas without coverage, but not many.

For a genealogist, this is beautiful!!!

How many matches were painted?

DNAPainter paternal total.png

DNAPainter maternal total.png

Expanding chromosome 1, and scrolling to the maternal portion, I can now see that I have several painted maternal segments, and almost the entire chromosome is covered.

Here’s the exciting part!

DNAPainter ch1 1 mat expanded.png

I starred the relatives I know, on the painting, above and on the pedigree chart, below. The green group descends through Hiram Ferverda and Eva Miller, the yellow group through Antoine Lore and Rachel Hill. The blue group is Acadian, upstream of Antoine Lore.

DNAPainter maternal pedigree.png

Those ancestors are shown by star color on my pedigree chart.

I can now focus on the genealogies of the other unstarred people to see if their genealogy can push those segments back further in time to older ancestors.

On my Dad’s side, the first part of chromosome 1 is equally exciting.

DNAPainter chr 1 pat expanded.png

The yellow star pushed this triangulated group back only to my grandparents, but the green star is from a cousin descended from my great-grandparents. The red star matches are even more exciting, because my common ancestor with Lawson is my brick wall – Marcus Younger and his wife, Susanna, surname unknown, parents of Mary Younger.

DNAPainter paternal pedigree.png

I need to really focus hard on this cluster of 12 people because THEIR common ancestors in their trees may well provide the key I need to push back another generation – through the brick wall. That is, after all, the goal of genetic genealogy.

Woohoooo!

Manual Spreadsheet Compare

Because I decided to torture myself one mid-winter day, and night, I wanted to see how much difference there is between the bucketed matches that I just painted and actual matches that I’ve identified by downloading my parents’ segment match files and mine and comparing them manually against each other. I removed any matches in my file that were not matches to my parent, in addition to me, then painted the rest.

I’ll import the resulting manual spreadsheet into the same experimental DNAPainter profile so we can view matches that were NOT painted previously. DNAPainter does not paint matches previously painted, if it can tell the difference. Since both of these files are from downloads, without the name of the matches being in any way modified, DNAPainter should be able to recognize everyone and only paint new segment matches.

Please note here that the PERSON unquestionably belongs bucketed to the parental side in question, but not all SEGMENTS necessarily match you and your parent. Some will not, and those are the segments that I removed from my spreadsheet.

DNAPainter manual spreadsheet example.png

Here’s a made-up example where I’ve combined my matches and my mother’s matches in one spreadsheet in order to facilitate this comparison. I colored my Mom’s matches green so they are easy to see when comparing to my own, then sorting by the match name.

Person 1 matches me and Mom both, at 10 cM on chromosome 1. Person 1 is assigned to my maternal side due to the matches above 9 cM, the lowest threshold at Family Tree DNA for bucketing.

In this example, we can see that Person 1 matches me and Mom (colored green), both, on the segment on chromosome 1. That match, bracketed by red, is a valid, phased, match and should be painted.

However, Person 1 also matches me, but NOT Mom on chromosome 2. Because Person 1 is bucketed to mother, this segment on chromosome 2 will also be painted to my maternal chromosome 2 using the DNAPainter import. The only way to sort this out is to do the comparison manually.

The same holds true for the X match shown. The two segments shown in red should NOT be painted, but they will be. Unless you are willing to compare your matches and your parents’ matches manually, you will just have to evaluate segments individually when you see that you’re working in a cluster where matches have been assigned through the mass import tool.

If you choose to compare the spreadsheets manually to assure that you’re not painting segments like the red ones above, DNAPainter provides instructions for you to create your own mass upload template, which is what I did after removing any segment matches of people that were not “in common” between me and mother on the same chromosomal segment, like the red ones, above.
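For readers comfortable with a little scripting, the manual comparison described above can be sketched in Python. This is only an illustration under stated assumptions: the `Segment` fields, the `shares_segment` and `split_by_parent` names, and the start and end positions are all made up for this example and are not the column layout of any Family Tree DNA download or DNAPainter template. It also treats any overlap at all as a shared segment, where in practice you would want a significant overlapping portion.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    match_name: str   # name of the person I match
    chromosome: str   # "1" through "22", or "X"
    start: int        # hypothetical start position
    end: int          # hypothetical end position
    cm: float         # segment size in centimorgans


def shares_segment(mine: Segment, parent: Segment) -> bool:
    """True when the same person matches both of us on overlapping positions."""
    return (
        mine.match_name == parent.match_name
        and mine.chromosome == parent.chromosome
        and mine.start <= parent.end
        and parent.start <= mine.end
    )


def split_by_parent(my_segments, parent_segments):
    """Split my segments into those my parent also shares and those they don't."""
    keep, remove = [], []
    for seg in my_segments:
        if any(shares_segment(seg, p) for p in parent_segments):
            keep.append(seg)
        else:
            remove.append(seg)
    return keep, remove


# Made-up data mirroring the example: Person 1 matches me on chromosomes
# 1, 2 and X, but matches Mom only on chromosome 1.
my_matches = [
    Segment("Person 1", "1", 1_000_000, 9_000_000, 10.0),
    Segment("Person 1", "2", 5_000_000, 11_000_000, 8.0),
    Segment("Person 1", "X", 2_000_000, 8_000_000, 7.0),
]
mom_matches = [
    Segment("Person 1", "1", 1_000_000, 9_000_000, 10.0),
]

to_paint, to_remove = split_by_parent(my_matches, mom_matches)
print([s.chromosome for s in to_paint])   # segments Mom shares too
print([s.chromosome for s in to_remove])  # segments that match only me
```

With this made-up data, only the chromosome 1 segment survives, and the chromosome 2 and X segments land in the removal list, just like the red rows in the spreadsheet.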

Please note that if you delete the erroneous segments and later reimport your bucketed matches, they will appear again. I’m more inclined to leave them, making a note.

I did not do a manual comparison of my father’s side of the tree after discovering just how little difference was found on my mother’s side, and how much effort was involved in the manual comparison.

Creating a Mass Upload Template and File

DNAPainter custom mass upload.png

The instructions for creating your own mass upload file are provided by DNAPainter – please follow them exactly.

In my case, after doing the manual spreadsheet compare with my mother, only a total of 18 new segments were imported that were not previously identified by bucketing.

Three of those segments were over 15cM, but the rest were smaller. I expected there would be more. Family Tree DNA is clearly doing a great job with maternal and paternal bucketing assignments, but they can’t do it without known relatives that have also tested and are linked to your tree. The very small discrepancy is likely due to matches with cousins that I have not been able to link on my tree.

The great news is that because DNAPainter recognizes already-painted segments, I can repeat this anytime and just paint the new segments, without worrying about duplicates.

  • The information above pertains to segments that should have been painted, but weren’t.
  • The information below pertains to segments that were painted, but should not have been.

I did not keep track of how many segments I deleted that would have erroneously been painted. There were certainly more than 18, but not an overwhelming number. Enough, though, to let me know to be careful and confirm the segment match individually before using any of the mass uploaded matches for hypotheses or conclusions.

Given that this experiment went well, I created a copy of my “real” profile in order to do the same import and see what discoveries are waiting!

Before and After

Before I did the imports into my “real” file (after making a copy, of course,) I had painted 82% of my DNA using 1700 segments. Of course, each one of those segments in my original profile is identified with an ancestor, even if they aren’t very far back in time.

Although I didn’t paint matches in common with my mother before this mass import, each of my matches in common with my mother is in common with one or the other of my maternal grandparents – and by using other known matches I can likely push the identity of those segments further back in time.

Status | Percent | Segments Painted
Before mass Phased Family Match bucketed import | 82 | 1700
After mass Phased Family Match bucketed import | 88 | 7123
After additional manual matches with my mother added | 88 | 7141

While I did receive 18 additional matching segments by utilizing the manually intensive spreadsheet matching and removal process, I did not receive enough additional matches to justify the hours and hours of work. I won’t be doing that anymore with Family Tree DNA files since they have so graciously provided bucketing and DNAPainter can leverage that functionality.
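To put those numbers in perspective, here is the arithmetic from the table as a few lines of Python. The variable names are mine and purely illustrative. The segment counts are the ones reported above.

```python
# Segment counts taken from the before-and-after table
segments_before = 1700
segments_after_bulk = 7123
segments_after_manual = 7141

# Segments added by each step
bulk_added = segments_after_bulk - segments_before
manual_added = segments_after_manual - segments_after_bulk

# Share of all newly painted segments that came from the manual comparison
manual_share_pct = 100 * manual_added / (bulk_added + manual_added)

print(bulk_added, manual_added, f"{manual_share_pct:.2f}%")
```

The bulk import accounts for 5,423 new segments and the manual comparison for 18, which is about a third of one percent of everything newly painted.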

Those hours will be much better spent focusing on unraveling the ancestors whose stories are told in clusters of triangulated matches.

I Love The Import Tool, But It’s Not Perfect

Keep in mind that the X chromosome needs a match of approximately twice the size of a regular chromosome to be as reliable. In other words, a 14 cM threshold for the X chromosome is roughly equivalent to a 7 cM match for any other chromosome. Said another way, a 7 cM match on the X is about equal to a 3.5 cM match on any other chromosome.

X matches are not created equal.

The SNP density on the X chromosome is about half that of the other chromosomes, making it virtually impossible to use the same matching criteria. I don’t encourage using matches of less than 500 SNPs unless you know you’re in a triangulated group and WITH at least a few larger, proven matches on that segment of the X chromosome.
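If you keep your X matches in a spreadsheet or script, this rule of thumb is easy to encode. The sketch below is my own illustration, not any vendor’s matching algorithm. The function names are hypothetical, and the default thresholds simply restate the figures above: halve the X segment size to get a rough autosomal equivalent, and treat anything under 500 SNPs with suspicion.

```python
def autosomal_equivalent_cm(x_cm: float) -> float:
    """Rough autosomal-equivalent size of an X match (rule of thumb: half)."""
    return x_cm / 2


def x_match_needs_caution(
    x_cm: float,
    snps: int,
    min_equivalent_cm: float = 7.0,  # assumed cutoff, from the 14 cM example
    min_snps: int = 500,             # SNP floor suggested in the text
) -> bool:
    """Flag X matches that are too small or too sparse to trust on their own."""
    return autosomal_equivalent_cm(x_cm) < min_equivalent_cm or snps < min_snps


print(autosomal_equivalent_cm(14))     # size of a 14 cM X match, halved
print(x_match_needs_caution(7, 600))   # small X match
print(x_match_needs_caution(14, 800))  # larger X match with plenty of SNPs
```

A flagged match isn’t necessarily wrong. It just needs support from a triangulated group with larger, proven matches on that same segment of the X chromosome before you rely on it.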

Having said that, X matches, due to their unique inheritance path can persist for many generations and be extremely useful. You can read about working with the X chromosome here and here.

I noticed when I was comparing segments in the manual spreadsheet that I had to remove many X matches with people who had identical matches on other chromosomes with me and my mother. In other words, just because they matched my mother and me exactly on one chromosome, that phasing did not, by default, extend to matching on other segments.

I checked my manually curated file and discovered that I had a total of seven X matches that should have been, and were, painted because they matched me and Mom both.

DNAPainter X spreadsheet example.png

However, there were many that didn’t match me and Mom both, matching only me, that were painted because that person was bucketed (assigned) to my maternal side because a different segment phased to mother correctly.

On the X chromosome, here’s what happened.

DNAPainter maternal X.png

You can see that a lot more than 7 bright red matches were painted – 26 more to be exact. That’s because if an individual is bucketed on your maternal or paternal side, it’s presumed that all of the matching segments come from the same ancestor and are legitimate, meaning identical by descent and not by chance. They aren’t. Every single segment has an inheritance path and story of its own – and just because one segment triangulates does NOT mean that other segments that match that person will triangulate as well.

The X chromosome is the worst case scenario of course, because these 7 cM segments are actually as reliable as roughly 3.5 cM segments on any other chromosome, which is to say that more than 50% of them will be incorrect. However, some will be accurate and those will match me and mother both. 21% of the X matches to people who phased and triangulated on other chromosomes were accurate – 79% were not. Thankfully, we have phasing, bucketing and tools like this to be able to tell the difference so we can utilize the 21% that are accurate. No one wants to throw the baby out with the bath water, nor do we want to chase after phantoms.
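The 21% and 79% figures are consistent with the counts given earlier: 7 X matches that matched both me and Mom, plus 26 more that were painted but matched only me. A quick check in Python, with variable names that are mine and only for illustration:

```python
valid_x_matches = 7     # matched both me and Mom
invalid_x_matches = 26  # painted by the import, but matched only me

total_painted = valid_x_matches + invalid_x_matches

pct_valid = round(100 * valid_x_matches / total_painted)
pct_invalid = round(100 * invalid_x_matches / total_painted)

print(total_painted, pct_valid, pct_invalid)
```

That’s 33 painted X matches in total, of which roughly one in five held up.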

Keep in mind that Phased Family Matching, like any other tool, is just that, a tool and needs some level of critical analysis.

Every Segment Has Its Own Story

We know that every single DNA segment has an independent inheritance path and story of its own. (Yes, I’ve said that several times now because it’s critically important so that you don’t wind up barking up the wrong tree, literally, pardon the pun.)

In the graphic above of my painted X chromosome matches, only the six matches with green stars are on the hand-curated match list. One had already been painted previously. The balance of the bright red matches were a part of the mass import and need to be deleted. Additionally, one of the accurate matches did not upload for some reason, so I’ll add that one manually.

I suggest that you go ahead and paint your bucketed segments, but understand that you may have a red herring or two in your crop of painted segment matches.

As you begin to work with these clusters of matches, check your matching segments with your parents (or other family members who were used in bucketing) and make sure that all the segments that have been painted by bulk upload actually match on all of the same segments.

If you have a parent that tested, there is no need to see if you and your match match other relatives on that same side. If your match does not match you and your parent on some significant overlapping portion of that same segment, the match is invalid. DNA does not “skip generations.”

If you don’t have a parent that has tested, your known relatives are your salvation, and the key to bucketed matches.

The great news is that you can easily see that a bulk match was painted from the coloring of the batch import. As you discover the relevant genealogy and confirm that all segments actually match your parent (or another family member, if you don’t have parents to test,) move the matching person to the appropriately colored ancestral group.

I further recommend that you hand curate the X chromosome using a spreadsheet. The nature of the X makes depending on phased matching too risky, especially with a tool like DNAPainter that can’t differentiate between a legitimate and non-legitimate match. The X chromosome matches are extraordinarily valuable because they can be useful in ways that other chromosomes can’t be due to the X’s unique inheritance path.

What About You?

If you don’t have your DNA at Family Tree DNA and you have tested elsewhere, you can transfer your DNA file for free, allowing you to see your matches and use many of the Family Tree DNA tools. However, to access the chromosome browser, which you’ll need for DNA painting, you’ll need to purchase the unlock for $19, but that’s still a lot less than retesting.

Here are transfer instructions for transferring your DNA file from 23andMe, Ancestry or MyHeritage.

If you have not purchased a Family Finder test at Family Tree DNA and don’t have a DNA file to transfer, you can order a test here.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

Fun DNA Stuff

  • Celebrate DNA – customized DNA themed t-shirts, bags and other items

2019: The Year and Decade of Change

2019 ends both a year and a decade. In the genealogy and genetic genealogy world, the overwhelmingly appropriate word to define both is “change.”

Everything has changed.

Millions more records are online now than ever before, both through the Big 3, being FamilySearch, MyHeritage and Ancestry, but also through multitudes of other sites preserving our history. Everyplace from National Archives to individual blogs celebrating history and ancestors.

All you need to do is google to find more than ever before.

I don’t know about you, but I’ve made more progress in the past decade than in all of the previous ones combined.

Just Beginning?

If you’re just beginning with genetic genealogy, welcome! I wrote this article just for you to see what to expect when your DNA results are returned.

If you’ve been working with genetic genealogy results for some time, or would like a great review of the landscape, let’s take this opportunity to take a look at how far we’ve come in the past year and decade.

It’s been quite a ride!

What Has Changed?

EVERYTHING

Literally.

A decade ago, we had Y and mitochondrial DNA, but just the beginning of the autosomal revolution in the genetic genealogy space.

In 2010, Family Tree DNA had been in business for a decade and offered both Y and mitochondrial DNA testing.

Ancestry offered a similar Y and mtDNA product, but not entirely the same markers, nor full sequence mitochondrial. Ancestry subsequently discontinued that testing and destroyed the matching database. Ancestry bought the Sorenson database that included Y, mitochondrial and autosomal, then destroyed that database too.

23andMe was founded in 2006 and began autosomal testing in 2007 for health and genealogy. Genealogists piled on that bandwagon.

Family Tree DNA added autosomal to their menu in 2010, but Ancestry didn’t offer an autosomal product until 2012 and MyHeritage not until 2016. Both Ancestry and MyHeritage have launched massive marketing and ad campaigns to help people figure out “who they are,” and who their ancestors were too.

Family Tree DNA

2019 FTDNA

Family Tree DNA had a banner year with the Big Y-700 product, adding over 211,000 Y DNA SNPs in 2019 alone to total more than 438,000 by year end, many of which became newly defined haplogroups. You can read more here. Additionally, Family Tree DNA introduced the Block Tree and public Y and public mitochondrial DNA trees.

Anyone who ignores Y DNA testing does so at their own peril. Information produced by Y DNA testing (and for that matter, mitochondrial too) cannot be obtained any other way. I wrote about utilizing mitochondrial DNA here and a series about how to utilize Y DNA begins in a few days.

Family Tree DNA remains the premier commercial testing company to offer high resolution and full sequence testing and matching, which of course is the key to finding genealogy solutions.

In the autosomal space, Family Tree DNA is the only testing company to provide Phased Family Matching which uses your matches on both sides of your tree, assuming you link 3rd cousins or closer, to assign other testers to specific parental sides of your tree.

Family Tree DNA accepts free uploads from other testing companies with the unlock for advanced features only $19. You can read about that here and here.

MyHeritage

MyHeritage, the DNA testing dark horse, has come from behind from their late entry into the field in 2016 with focused European ads and the purchase of Promethease in 2019. Their database stands at 3.7 million, not as many as either Ancestry or 23andMe, but for many people, including me – MyHeritage is much more useful, especially for my European lines. Not only is MyHeritage a genealogy company, piloted by Gilad Japhet, a passionate genealogist, but they have introduced easy-to-use advanced tools for consumers during 2019 to take the functionality lead in autosomal DNA.

2019 MyHeritage.png

You can read more about MyHeritage and their 2019 accomplishments, here.

As far as I’m concerned, the MyHeritage bases-loaded 4-product “Home Run” makes MyHeritage the best solution for genetic genealogy via either testing or transfer:

  • Triangulation – shows testers where 3 or more people match each other. You can read more, here.
  • Tree Matching – SmartMatching for both DNA testers and those who have not DNA tested
  • Theories of Family Relativity – a wonderful new tool introduced in February. You can read more here.
  • AutoClusters – Integrated cluster technology helps you to visualize which groups of people match each other.

One of their best features, Theories of Family Relativity connects the dots between people you DNA match with disparate trees and other documents, such as census. This helps you and others break down long-standing brick walls. You can read more, here.

MyHeritage encourages uploads from other testing companies with basic functions such as matching for free. Advanced features cost either a one-time unlock fee of $29 or are included with a full subscription which you can try for free, here. You can read about what is free and what isn’t, here.

You can develop a testing and upload strategy along with finding instructions for how to upload here and here.

23andMe

Today, 23andMe is best known for health, having recovered after having had their wings clipped a few years back by the FDA. They were the first to offer Health results, leveraging the genealogy marketspace to attract testers, but have recently been eclipsed by both Family Tree DNA with their high end full Exome Tovana test and MyHeritage with their Health upgrade which provides more information than 23andMe along with free genetic counseling if appropriate. Both the Family Tree DNA and MyHeritage tests are medically supervised, so can deliver more results.

23andMe has never fully embraced genetic genealogy by adding the ability to upload and compare trees. In 2019, they introduced a beta function to attempt to create a genetic tree on your behalf based on how your matches match you and each other.

2019 23andMe.png

These trees aren’t accurate today, nor are they deep, but they are a beginning – especially considering that they are not based on existing trees. You can read more here.

The best 23andMe feature for genealogy, as far as I’m concerned, is their ethnicity along with the fact that they actually provide testers with the locations of their ethnicity segments which can help testers immensely, especially with minority ancestry matching. You can read about how to do this for yourself, here.

23andMe generally does not allow uploads, probably because they need people to test on their custom-designed medical chip. Very rarely, once that I know of in 2018, they do allow uploads – but in the past, uploaders did not receive all of the genealogy features and benefits of testing.

You can however, download your DNA file from 23andMe and upload elsewhere, with instructions here.

Ancestry

Ancestry is widely known for their ethnicity ads which are extremely effective in recruiting new testers. That’s the great news. The results are frustrating to seasoned genealogists who get to deal with the fallout of confused people trying to figure out why their results don’t match their expectations and family stories. That’s the not-so-great news.

However, with more than 15 million testers, many of whom DO have genealogy trees, a serious genealogist can’t *NOT* test at Ancestry. Testers do need to be aware that not all features are available to DNA testers who don’t also subscribe to Ancestry’s genealogy subscriptions. For example, you can’t see your matches’ trees beyond a 5 generation preview without a subscription. You can read more about what you do and don’t receive, here.

Ancestry is the only one of the major companies that doesn’t provide a chromosome browser, despite pleas for years to do so, but they do provide ThruLines that show you other testers who match your DNA and show a common ancestor with you in their trees.

2019 Ancestry.png

ThruLines will also link partial trees – showing you ancestral descendants from the perspective of the ancestor in question, shown above. You can read about ThruLines, here.

Of course, without a chromosome browser, this match is only as good as the associated trees, and there is no way to prove the genealogical connection. It’s possible to all be wrong together, or to be related to some people through a completely different ancestor. Third party tools like Genetic Affairs and cluster technology help resolve these types of issues. You can read more, here.

You can’t upload DNA files from other testing companies to Ancestry, probably due to their custom medical chip. You can download your file from Ancestry and upload to other locations, with instructions here.

Selling Customers’ DNA

Neither Family Tree DNA, MyHeritage nor GEDmatch sell, lease or otherwise share their customers’ DNA, and all three state (minimally) they will not in the future without prior authorization.

All companies utilize their customers’ DNA internally to enhance and improve their products. That’s perfectly normal.

Both Ancestry and 23andMe sell consumers’ DNA to both known and unknown partners if customers opt-in to additional research. That’s the purpose of all those questions.

If you do agree or opt-in, and for those who tested prior to when the opt-in began, consumers don’t know who their DNA has been sold to, where it is or for what purposes it’s being utilized. Although anonymized (pseudonymized) before sale, autosomal results can easily be identified to the originating tester (if someone were inclined to do so) as demonstrated by adoptees identifying parents and law enforcement identifying both long deceased remains and criminal perpetrators of violent crimes. You can read more about re-identification here, although keep in mind that the re-identification frequency (%) would be much higher now than it was in 2018.

People are widely split on this issue. Whatever you decide, to opt-in or not, just be sure to do your homework first.

Always read the terms and conditions fully and carefully of anything having to do with genetics.

Genealogy

The bottom line to genetic genealogy is the genealogy aspect. Genealogists want to confirm ancestors and discover more about those ancestors. Some information can only be discovered via DNA testing today: distant Native heritage, for example, or the evidence needed to break through brick walls.

This technology, as it has advanced and more people have tested, has been a godsend for genealogists. The same techniques have allowed other people to locate unknown parents, grandparents and close relatives.

Adoptees

Not only are genealogists identifying people long in the past that are their ancestors, but adoptees and those seeking unknown parents are making discoveries much closer to home. MyHeritage has twice provided thousands of free DNA tests via their DNAQuest program to adoptees seeking their biological family with some amazing results.

The difference between genealogy, which looks back in time several generations, and parent or grand-parent searches is that unknown-parent searches use matches to come forward in time to identify parents, not backwards in time to identify distant ancestors in common.

Adoptee matching is about identifying descendants in common. According to Erlich et al in an October 2018 paper, here, about 60% of people with European ancestry could be identified. With the database growth since that time, that percentage has risen, I’m sure.

You can read more about the adoption search technique and how it is used, here.

Adoptee searches have spawned their own subculture of sorts, with researchers and search angels that specialize in making these connections. Do be aware that while many reunions are joyful, not all discoveries are positively received and the revelations can be traumatic for all parties involved.

There’s yin and yang involved, of course, and the exact same techniques used for identifying biological parents are also used to identify cold-case deceased victims of crime as well as violent criminals, meaning rapists and murderers.

Crimes Solved

The use of genetic genealogy and adoptee search techniques for identifying skeletal remains of crime victims, as well as identifying criminals in order that they can be arrested and removed from the population has resulted in a huge chasm and division in the genetic genealogy community.

These same issues have become popular topics in the press, often authored by people who have no experience in this field, don’t understand how these techniques are applied or function and/or are more interested in a sensational story than in the truth. The word click-bait springs to mind although certainly doesn’t apply equally to all.

Some testers are adamantly pro-usage of their DNA in order to identify victims and apprehend violent criminals. Other testers, not so much and some, on the other end of the spectrum are vehemently opposed. This is a highly personal topic with extremely strong emotions on both sides.

The first such case was the Golden State Killer, which has been followed in the past 18 months or so by another 100+ solved cases.

Regardless of whether or not people want their own DNA to be utilized to identify these criminals and victims, providing closure for families, I suspect the one thing we can all agree on is that we are grateful that these violent criminals no longer live among us and are no longer preying on innocent victims.

I wrote about the Golden State Killer, here, as well as other articles here, here, here and here.

In the genealogy community, various vendors have adopted quite different strategies relating to these kinds of searches, as follows:

  • Ancestry, 23andMe and MyHeritage – have committed to fight all access attempts by law enforcement, including court ordered subpoenas.
  • MyHeritage, Family Tree DNA and GEDmatch allow uploads, so forensic kits, meaning kits from deceased remains or rape kits, could be uploaded to search for matches, the same as any other kit. Law enforcement uploads violate the MyHeritage terms of service. Both Family Tree DNA and GEDmatch have special law enforcement procedures in place. All three companies have measures in place to attempt to detect unauthorized forensic uploads.
  • Family Tree DNA has provided a specific Law Enforcement protocol and guidelines for forensic uploads, here. All EU customers were opted out earlier in 2019, but all new or existing non-EU customers need to opt out if they do not want their DNA results available for matching to law enforcement kits.
  • GEDmatch was recently sold to Verogen, a DNA forensics company, with information, here. Currently GEDmatch customers are opted-out of matching for law enforcement kits, but can opt-in. Verogen, upon purchase of GEDmatch, required all users to read the terms and conditions and either accept the terms or delete their kits. Users can also delete their kits or turn off/on law enforcement matching at any time.

New Concerns

Concerns in late 2019 have focused on the potential misuse of genetic matching by despotic regimes to target subsets of individuals, such as has been done by China to the Uighurs.

You can read about potential risks here, here and here, along with a recent DoD memo here.

Some issues spelled out in the papers can be resolved by vendors agreeing to cryptographically sign their files when customers download. Of course, this would require that everyone, meaning all vendors, play nice in the sandbox. So far, that hasn’t happened although I would expect that the vendors accepting uploads would welcome cryptographic signatures. That pretty much leaves Ancestry and 23andMe. I hope they will step up to the plate for the good of the industry as a whole.

Relative to the concerns voiced in the papers and by the DoD, I do not wish to understate any risks. There ARE certainly risks of family members being identified via DNA testing, which is, after all, the initial purpose even though the current (and future) uses were not foreseen initially.

In most cases, the cow has already left that barn. Even if someone new chooses not to test, the critical threshold for preventing identification of individuals has already been passed, at least within the US and/or European diaspora communities.

I do have concerns:

  • Websites where the owners are not known in the genealogical community could be collecting uploads for clandestine purposes. “Free” sites are extremely attractive to novices who tend to forget that if you’re not paying for the product, you ARE the product. Please be very cognizant and leery. Actually, just say no unless you’re positive.
  • Fearmongering and click-bait articles in general are already causing knee-jerk reactions, leading potential testers to reject DNA testing outright, without doing any research or reading terms and conditions.
  • That Ancestry and 23andMe, the two major vendors who don’t accept uploads, will refuse to add crypto-signatures to protect their customers who download files.

Every person needs to carefully make their own decisions about DNA testing and participating in sharing through third party sites.

Health

Not surprisingly, the DNA testing market space has cooled a bit this past year. This slowdown is likely due to a number of factors such as negative press and the fact that perhaps the genealogical market is becoming somewhat saturated. Although, I suspect that when vendors announce major new tools, their DNA kit sales spike accordingly.

Look at it this way, do you know any serious genealogists who haven’t DNA tested? Most are in all of the major databases, meaning Ancestry, 23andMe, FamilyTreeDNA, MyHeritage and GEDmatch.

All of the testing companies mentioned above (except GEDmatch who is not a testing company) now have a Health offering, designed to offer existing and new customers additional value for their DNA testing dollar.

23andMe separated their genealogy and health offering years ago. Ancestry and MyHeritage now offer a Health upgrade. For existing customers, FamilyTreeDNA offers the Cadillac of health tests through Tovana.

I would guess it goes without saying here that if you really don’t want to know about potential health issues, don’t purchase these tests. The flip side is, of course, that most of the time, a genetic predisposition is nothing more than that, and not a death sentence.

From my own perspective, I found the health tests to be informative, actionable and in some cases, they have been lifesaving for friends.

Whoever knew genealogy might save your life.

Innovative Third-Party Tools

Tools, and fads, come and go.

In the genetic genealogy space, over the years, tools have burst on the scene to disappear a few months later. However, the last few years have been won by third party tools developed by well-known and respected community members who have created tools to assist other genealogists.

As we close this decade, these are my picks of the tools that I use almost daily, have proven to be the most useful genealogically and that I feel I just “couldn’t live without.”

And yes, before you ask, some of these have a bit of a learning curve, but if you are serious about genealogy, these are all well worthwhile:

  • GEDmatch – offers a wide variety of tools including triangulation, half versus fully identical segments and the ability to see who your matches also match. One of the tools I utilize regularly is segment search to see who else matches me on a specific segment, attached to an ancestor I’m researching. GEDmatch, started by genealogists, lasted more than a decade prior to its sale in December 2019.
  • Genetic Affairs – a barn-burning newcomer developed by Evert-Jan Blom in 2018 wins this year’s “Best” award from me, titled appropriately, the “SNiPPY.”

Genetic Affairs 2019 SNiPPY Award.png

Genetic Affairs offers clustering and tree building between your matches, even when YOU don’t have a tree. You can read more here.

2019 genetic affairs.png

Just today, Genetic Affairs released a new cluster interface with DNAPainter, example shown above.

  • DNAPainter – THE chromosome painter created by Jonny Perl just gets better and better, having added pedigree tree construction this year and other abilities. I wrote a composite instructional article, here.
  • DNAGedcom.com and Genetic.Families, affiliated with DNAAdoption.org – Rob Warthen in collaboration with others provides tools like clustering combined with triangulation. My favorite feature is the gathering of all direct ancestors of my matches’ trees at the various vendors where I’ve DNA tested which allows me to search for common surnames and locations, providing invaluable hints not otherwise available.

Promising Newcomer

  • MitoYDNA – a non-profit newcomer by folks affiliated with DNAAdoption and DNAGedcom is designed to replace YSearch and MitoSearch, both felled by the GDPR ax in 2018. This website allows people to upload their Y and mitochondrial DNA results and compare the values to each other, not just for matching, which you can do at Family Tree DNA, but also to see the values that do and don’t match and how they differ. I’ll be taking MitoYDNA for a test drive after the first of the year and will share the results with you.

The Future

What does the future hold? I almost hesitate to guess.

  • Artificial Intelligence Pedigree Chart – I think that in the not-too-distant future we’ll see the ability to provide testers with a “one and done” pedigree chart. In other words, you will test and receive at least some portion of your genealogy all tidily presented, red ribbon untied and scroll rolled out in front of you like you’re the guest on one of those genealogy TV shows.

Except it’s not a show and is a result of DNA testing, segment triangulation, trees and other tools which narrow your ancestors to only a few select possibilities.

Notice I said, “the ability to.” Just because we have the ability doesn’t mean a vendor will implement this functionality. In fact, just think about the massive businesses built upon the fact that we, as genealogists, have to SEARCH incessantly for these elusive answers. Would it be in the best interest of these companies to just GIVE you those answers when you test?

If not, then these types of answers will rest with third parties. However, there’s a hitch. Vendors generally don’t welcome third parties offering advanced tools and therefore block those tools, even though they are being used BY the customer or with their explicit authorization to massage their own data.

On the other hand, as a genealogist, I would welcome this feature with open arms – because as far as I’m concerned, the identification of that ancestor is just the first step. I get to know them by fleshing out their bones by utilizing those research records.

In fact, I’m willing to pony up to the table and I promise, oh-so-faithfully, to maintain my subscription lifelong if one of those vendors will just test me. Please, please, oh pretty-please put me to the test!

I guess you know what my New Year’s Wish is for this and upcoming years now too😊

What About You?

What do you think the high points of 2019 have been?

How about the decade?

What do you think the future holds?

Do you care to make any predictions?

Are you planning to focus on any particular goal or genealogy problem in 2020?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

Fun DNA Stuff

  • Celebrate DNA – customized DNA themed t-shirts, bags and other items

Hit a Genetic Genealogy Home Run Using Your Double-Sided Two-Faced Chromosomes While Avoiding Imposters

Do you want to hit a home run with your DNA test, but find yourself a mite bewildered?

Yep, those matches can be somewhat confusing – especially if you don’t understand what’s going on. Do you have a nagging feeling that you might be missing something?

I’m going to explain chromosome matching, and its big sister, triangulation, step by step to remove any confusion, to help you sort through your matches and avoid imposters.

This article is one of the most challenging I’ve ever written – in part because it’s a concept that I’m so familiar with but can be, and is, misinterpreted so easily. I see mistakes and confusion daily, which means that resulting conclusions stand a good chance of being wrong.

I’ve tried to simplify these concepts by giving you easy-to-use memory tools.

There are three key phrases to remember, as memory-joggers when you work through your matches using a chromosome browser: double-sided, two faces and imposter. While these are “cute,” they are also quite useful.

When you’re having a confusing moment, think back to these memory-jogging key words and walk yourself through your matches using these steps.

These three concepts are the foundation of understanding your matches, accurately, as they pertain to your genealogy. Please feel free to share, link or forward this article to your friends and especially your family members (including distant cousins) who work with genetic genealogy. 

Now, it’s time to enjoy your double-sided, two-faced chromosomes and avoid those imposters:)

Are you ready? Grab a nice cup of coffee or tea and learn how to hit home runs!

Double-Sided – Yes, Really

Your chromosomes really are double sided, and two-faced too – and that’s a good thing!

However, it’s initially confusing because when we view our matches in a chromosome browser, it looks like we only have one “bar” or chromosome and our matches from both our maternal and paternal sides are both shown on our one single bar.

How can this be? We all have two copies of chromosome 1, one from each parent.

Chromosome 1 match.png

This is my chromosome 1, with my match showing in blue when compared to my chromosome, in gray, as the background.

However, I don’t know if this blue person matches me on my mother’s or father’s chromosome 1, both of which I inherited. It could be either. Or neither – meaning the dreaded imposter – especially that small blue piece at left.

What you’re seeing above is in essence both “sides” of my chromosome number 1, blended together, in one bar. That’s what I mean by double-sided.

There’s no way to tell which side or match is maternal and which is paternal without additional information – and misunderstanding leads to misinterpreting results.

Let’s straighten this out and talk about what matches do and don’t mean – and why they can be perplexing. Oh, and how to discover those imposters!

Your Three Matches

Let’s say you have three matches.

At Family Tree DNA, the example chromosome browser I’m using, or at any vendor with a chromosome browser, you select your matches which are viewed against your chromosomes. Your chromosomes are always the background, meaning in this case, the grey background.

Chromosome 1-4.png

  • This is NOT three copies each of your chromosomes 1, 2, 3 and 4.
  • This is NOT displaying your maternal and paternal copies of each chromosome pictured.
  • We CANNOT tell anything from this image alone relative to maternal and paternal side matches.
  • This IS showing three individual people matching you on your chromosome 1 and the same three people matching you in the same order on every chromosome in the picture.

Let’s look at what this means and why we want to utilize a chromosome browser.

I selected three matches that I know are not all related through the same parent so I can demonstrate how confusing matches can be sorted out. Throughout this article, I’ve tried to explain each concept in at least two ways.

Please note that I’m using only chromosomes 1-4 as examples, not because they are any more, or less, important than the other chromosomes, but because showing all 22 would not add any benefit to the discussion. The X chromosome has a separate inheritance path and I wrote about that here.

Let’s start with a basic question.

Why Would I Want to Use a Chromosome Browser?

Genealogists view matches on chromosome browsers because:

  • We want to see where our matches match us on our chromosomes
  • We’d like to identify our common ancestor with our match
  • We want to assign a matching segment to a specific ancestor or ancestral line, which confirms those ancestors as ours
  • When multiple people match us on the same location on the chromosome browser, that’s a hint telling us that we need to scrutinize those matches more closely to determine if those people match us on our maternal or paternal side which is the first step in assigning that segment to an ancestor

Once we accurately assign a segment to an ancestor, when anyone else matches us (and those other people) on that same segment, we know which ancestral line they match through – which is a great head start in terms of identifying our common ancestor with our new match.

That’s a genetic genealogy home run!

Home Runs 

There are four bases in a genetic genealogy home run.

  1. Determine whether you actually match someone on the same segment
  2. Which is the first step in determining that you match a group of people on the same segment
  3. And that you descend from a common ancestor
  4. The fourth step, or the home run, is to determine which ancestor you have in common, assigning that segment to that ancestor

If you can’t see segment information, you can’t use a chromosome browser and you can’t confirm the match on that segment, nor can you assign that segment to a particular ancestor, or ancestral couple.

The entire purpose of genealogy is to identify and confirm ancestors. Genetic genealogy confirms the paper trail and breaks down even more brick walls.

But before you can do that, you have to understand what matches mean and how to use them.

The first step is to understand that our chromosomes are double-sided and you can’t see both of your chromosomes at once!

Double Sided – You Can’t See Both of Your Chromosomes at Once

The confusing part of the chromosome browser is that it can only “see” your two chromosomes blended as one. They are both there, but you just can’t see them separately.

Here’s the important concept:

You have 2 copies of chromosomes 1 through 22 – one copy that you received from your mother and one from your father, but you can’t “see” them separately.

When your DNA is sequenced, your DNA from your parents’ chromosomes emerges as if it has been through a blender. Your mother’s chromosome 1 and your father’s chromosome 1 are blended together. That means that without additional information, the vendor can’t tell which matches are from your father’s side and which are from your mother’s side – and neither can you.

All the vendor can tell is that someone matches you on the blended version of your parents. This isn’t a negative reflection on the vendors, it’s just how the science works.

Chromosome 1.png

Applying this to chromosome 1, above, means that each segment from each person, the blue person, the red person and the teal person might match you on either one of your chromosomes – the paternal chromosome or the maternal chromosome – but because the DNA of your mother and father are blended – there’s no way without additional information to sort your chromosome 1 into a maternal and paternal “side.”

Hence, you’re viewing “one” copy of your combined chromosomes above, but it’s actually “two-sided” with both maternal and paternal matches displayed in the chromosome browser.

Parent-Child Matches

Let’s explain this another way.

Chromosome parent.png

The example above shows one of my parents matching me. Don’t be deceived by the color blue which is selected randomly. It could be either parent. We don’t know.

You can see that I match my parent on the entire length of chromosome 1, but there is no way for me to tell if I’m looking at my mother’s match or my father’s match, because both of my parents (and my children) will match me on exactly the same locations (all of them) on my chromosome 1.

Chromosome parent child.png

In fact, here is a combination of my children and my parents matching me on my chromosome 1.

To sort out who is matching on paternal and maternal chromosomes, or the double sides, I need more information. Let’s look at how inheritance works.

Stay with me!

Inheritance Example

Let’s take a look at how inheritance works visually, using an example segment on chromosome 1.

Chromosome inheritance.png

In the example above:

  • The first column shows addresses 1-10 on chromosome 1. In this illustration, we are only looking at positions, chromosome locations or addresses 1-10, but real chromosomes have tens of thousands of addresses. Think of your chromosome as a street with the same house numbers on both sides. One side is Mom’s and one side is Dad’s, but you can’t tell which is which by looking at the house numbers because the house numbers are identical on both sides of the street.
  • The DNA pieces, or nucleotides (T, A, C or G,) that you received from your Mom are shown in the column labeled Mom #1, meaning we’re looking at your mother’s pink chromosome #1 at addresses 1-10. In our example she has all As that live on her side of the street at addresses 1-10.
  • The DNA pieces that you received from your Dad are shown in the blue column and are all Cs living on his side of the street in locations 1-10.

In other words, the values that live in the Mom and Dad locations on your chromosome streets are different. Two different faces.

However, all that the laboratory equipment can see is that there are two values at address 1, A and C, in no particular order. The lab can’t tell which nucleotide came from which parent or which side of the street they live on.

The DNA sequencer knows that it found two values at each address, meaning that there are two DNA strands, but the output is jumbled, as shown in the First and Second read columns. The machine knows that you have an A and C at the first address, and a C and A at the second address, but it can’t put the sequence of all As together and the sequence of all Cs together. What the sequencer sees is entirely unordered.

This happens because your maternal and paternal DNA is mixed together during the extraction process.
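
If it helps to see the blender effect in action, here is a minimal Python sketch. It assumes the toy values from the illustration (all As from Mom, all Cs from Dad), and the function name genotype is my own invention for this example, not something a lab or vendor actually runs.

```python
# Toy values from the illustration: Mom passed down all As, Dad all Cs.
mom_side = "AAAAAAAAAA"
dad_side = "CCCCCCCCCC"


def genotype(maternal, paternal):
    """Return what the lab reports: an unordered pair of values at each address."""
    return [frozenset((m, p)) for m, p in zip(maternal, paternal)]


reads = genotype(mom_side, dad_side)

# Swapping the two parents produces exactly the same report,
# which is why the maternal and paternal "sides" are lost.
same_when_swapped = reads == genotype(dad_side, mom_side)
print(same_when_swapped)
```

Because each address is stored as an unordered pair, nothing in the output says which value lived on which side of the street.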

Chromosome actual

Looking at the portion of chromosome 1 where the blue and teal people both match you – your actual blended values are shown overlaid on that segment, above. We don’t know why the blue and the teal people are matching you. They could be matching because they have all As (maternal), all Cs (paternal) or some combination of As and Cs (a false positive match that is identical by chance.)
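
Here is a small Python sketch of those three possibilities. It is a deliberate simplification: I represent each match by a single strand of values, while real matching compares two unordered pairs, and the names my_reads and matches_me are hypothetical.

```python
# What the lab reports for me at addresses 1-10: an A and a C, in no order.
my_reads = [frozenset("AC")] * 10


def matches_me(strand, reads):
    """True when the strand's value is one of my two values at every address."""
    return all(value in pair for value, pair in zip(strand, reads))


candidates = {
    "all As (maternal)": "AAAAAAAAAA",
    "all Cs (paternal)": "CCCCCCCCCC",
    "mix of As and Cs (by chance)": "ACCACAACCA",
}

for label, strand in candidates.items():
    print(label, matches_me(strand, my_reads))
```

All three candidates pass the test, so the comparison alone cannot tell a real maternal or paternal match from an imposter.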

There are only two ways to reassemble your nucleotides (T, A, C, and G) in order and then to identify the sides as maternal and paternal – phasing and matching.

As you read this next section, it does NOT mean that you must have a parent for a chromosome browser to be useful – but it does mean you need to understand these concepts.

There are two types of phasing.

Parental Phasing

  • Parental Phasing is when your DNA is compared against that of one or both parents and sorted based on that comparison.

Chromosome inheritance actual.png

Parental phasing requires that at least one parent’s DNA is available, has been sequenced and is available for matching.

In our example, Dad’s first 10 locations (that you inherited) on chromosome 1 are shown, at left, with your two values shown as the first and second reads. One of your read values came from your father and the other one came from your mother. In this case, the Cs came from your father. (I’m using A and C as examples, but the values could just as easily be T or G or any combination.)

When parental phasing occurs, the DNA of one of your parents is compared to yours. In this case, your Dad gave you a C in locations 1-10.

Now, the vendor can look at your DNA and assign your DNA to one parent or the other. There can be some complicating factors, like if both your parents have the same nucleotides, but let’s keep our example simple.

In our example above, you can see that I’ve colored portions of the first and second strands blue to represent that the C value at that address can be assigned through parental phasing to your father.

Conversely, because your mother’s DNA is NOT available in our example, we can’t compare your DNA to hers, but all is not lost. Because we know which nucleotides came from your father, the remaining nucleotides had to come from your mother. Hence, the As remain after the Cs are assigned to your father and belong to your mother. These remaining nucleotides can logically be recombined into your mother’s DNA – because we’ve subtracted Dad’s DNA.

I’ve reassembled Mom, in pink, at right.
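
The subtraction step can be sketched in a few lines of Python. This is only an illustration under the simple assumptions of our example: the father's inherited values are known at every address, and the function name phase_with_father is mine, not a vendor's.

```python
def phase_with_father(reads, father_strand):
    """Split jumbled reads into Dad's side and, by subtraction, Mom's side."""
    dad_side, mom_side = [], []
    for (first, second), dad_value in zip(reads, father_strand):
        if first == dad_value:
            dad_side.append(first)
            mom_side.append(second)
        elif second == dad_value:
            dad_side.append(second)
            mom_side.append(first)
        else:
            # Neither read agrees with Dad at this address.
            raise ValueError("reads are not consistent with this father")
    return "".join(dad_side), "".join(mom_side)


# Jumbled output: A and C at the first address, C and A at the second, and so on.
reads = [("A", "C"), ("C", "A")] * 5
father = "CCCCCCCCCC"

dad, mom = phase_with_father(reads, father)
print(dad)
print(mom)
```

Whatever is left after Dad's values are assigned is collected as Mom's side, even though her DNA was never compared.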

Statistical/Academic Phasing

  • A second type of phasing uses something referred to as statistical or academic phasing.

Statistical phasing is less successful because it uses statistical calculations based on reference populations. In other words, it uses a “most likely” scenario.

By studying reference populations, we know scientifically that, generally, for our example addresses 1-10, we either see all As or all Cs grouped together.

Based on this knowledge, the Cs can then logically be grouped together on one “side” and As grouped together on the other “side,” but we still have no way to know which side is maternal or paternal for you. We only know that normally, in a specific population, we see all As or all Cs. After assigning strings or groups of nucleotides together, the algorithm then attempts to see which groups are found together, thereby assigning genetic “sides.” Assigning the wrong groups to the wrong side sometimes happens using statistical phasing and is called strand swap.

Once the DNA is assigned to physical “sides” without a parent or matching, we still can’t identify which side is paternal and which is maternal for you.
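
As a rough Python sketch of the idea, and nothing like the sophisticated algorithms the vendors really use, suppose the reference population gives us a short list of strands that are commonly seen together. The reference list and the function name statistical_phase are assumptions made up for this example.

```python
from itertools import combinations_with_replacement


def statistical_phase(reads, reference_strands):
    """Find two reference strands that together explain every unordered read."""
    for side_1, side_2 in combinations_with_replacement(reference_strands, 2):
        if all({a, b} == set(pair) for a, b, pair in zip(side_1, side_2, reads)):
            return side_1, side_2
    return None  # no pair of reference strands fits


# Hypothetical strands seen in a reference population.
reference = ["AAAAAAAAAA", "CCCCCCCCCC", "AAAAACCCCC"]
reads = [("A", "C"), ("C", "A")] * 5

sides = statistical_phase(reads, reference)
print(sides)
```

The result is two "sides," but notice that neither one is labeled maternal or paternal. That label still has to come from a parent or from matching.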

Statistical or academic phasing isn’t always accurate, in part because of the differences found in various reference populations and resulting admixture. Sometimes segments don’t match well with any population. As more people test and more reference populations become available, statistical/academic phasing improves. 23andMe uses academic phasing for ethnicity, resulting in a strand swap error for me. Ancestry uses academic phasing before matching.

By comparison to statistical or academic phasing, parental phasing with either or both parents is highly accurate which is why we test our parents and grandparents whenever possible. Even if the vendor doesn’t use our parents’ results, we certainly can!

If someone matches you and your parent too, you know that match is from that parent’s side of your tree.
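
In Python terms, that rule is a simple membership check. The match names and lists below are hypothetical, and the sketch assumes both parents have tested.

```python
def side_of(match, mother_matches, father_matches):
    """Assign a match to a side based on which parent's match list includes them."""
    on_mom = match in mother_matches
    on_dad = match in father_matches
    if on_mom and on_dad:
        return "both"
    if on_mom:
        return "maternal"
    if on_dad:
        return "paternal"
    return "unassigned"


# Hypothetical match lists for illustration only.
mother_matches = {"cousin_1", "cousin_3"}
father_matches = {"cousin_2", "cousin_3"}

for name in ["cousin_1", "cousin_2", "cousin_3", "cousin_4"]:
    print(name, side_of(name, mother_matches, father_matches))
```

A match who lands in "unassigned" when both parents have tested deserves extra scrutiny, because they may be identical by chance.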

Matching

The second methodology to sort your DNA into maternal and paternal sides is matching, either with or without your parents.

Matching to multiple known relatives on specific segments assigns those segments of your DNA to the common ancestor of those individuals.

In other words, when I match my first cousin, and our genealogy indicates that we share grandparents – assuming we match on the appropriate amount of DNA for the expected relationship – that match goes a long way to confirming our common ancestor(s).

The closer the relationship, the more comfortable we can be with the confirmation. For example, if you match someone at a parental level, they must be either your biological mother, father or child.

While parent, sibling and close relationships are relatively obvious, more distant relationships are not and can occur through unknown or multiple ancestors. In those cases, we need multiple matches through different children of that ancestor to reasonably confirm ancestral descent.

Ok, but how do we do that? Let’s start with some basics that can be confusing.

What are we really seeing when we look at a chromosome browser?

The Grey/Opaque Background is Your Chromosome

It’s important to realize that you will see as many images of your chromosome(s) as people you have selected to match against.

This means that if you’ve selected 3 people to match against your chromosomes, then you’ll see three images of your chromosome 1, three images of your chromosome 2, three images of your chromosome 3, three images of your chromosome 4, and so forth.

Remember, chromosomes are double-sided, so you don’t know whether these are maternal or paternal matches (or imposters.)

In the illustration below, I’ve selected three people to match against my chromosomes in the chromosome browser. One person is shown as a blue match, one as a red match, and one as a teal match. Where these three people match me on each chromosome is shown by the colored segments on the three separate images.

Chromosome 1.png

My chromosome 1 is shown above. These images are simply three people matching to my chromosome 1, stacked on top of each other, like cordwood.

The first image is for the blue person. The second image is for the red person. The third image is for the teal person.

If I selected another person, they would be assigned a different color (by the system) and a fourth stacked image would occur.

These stacked images of your chromosomes are NOT inherently maternal or paternal.

In other words, the blue person could match me maternally and the red person paternally, or any combination of maternal and paternal. Colors are not relevant – in other words colors are system assigned randomly.

Notice that portions of the blue and teal matches overlap at some of the same locations/addresses, which is immediately visible when using a chromosome browser. These areas of common matching are of particular interest.

Let’s look closer at how chromosome browser matching works.

What about those colorful bars?

Chromosome Browser Matching

When you look at your chromosome browser matches, you may see colored bars on several chromosomes. In the display for each chromosome, the same color will always be shown in the same order. Most people, unless very close relatives, won’t match you on every chromosome.

Below, we’re looking at three individuals matching on my chromosomes 1, 2, 3 and 4.

Chromosome browser.png

The blue person will be shown in location A on every chromosome at the top. You can see that the blue person does not match me on chromosome 2 but does match me on chromosomes 1, 3 and 4.

The red person will always be shown in the second position, B, on each chromosome. The red person does not match me on chromosomes 2 or 4.

The teal person will always be shown in position C on each chromosome. The teal person matches me on at least a small segment of chromosomes 1-4.

When you close the browser and select different people to match, the colors will change and the stacking order perhaps, but each person selected will always be consistently displayed in the same position on all of your chromosomes each time you view.

The Same Address – Stacked Matches

In the example above, we can see that several locations show stacked segments in the same location on the browser.

Chromosome browser locations.png

This means that on chromosome 1, the blue and teal person both match me on at least part of the same addresses – the areas that overlap fully. Remember, we don’t know if that means the maternal side or the paternal side of the street. Each match could match on the same or different sides.

Said another way, blue could be maternal and teal could be paternal (or vice versa,) or both could be maternal or paternal. One or the other or both could be imposters, although with large segments that’s very unlikely.

On chromosome 4, blue and teal both match me on two common locations, but the teal person extends beyond the length of the matching blue segments.

Chromosome 3 is different because all three people match me at the same address. Even though the red and teal matching segments are longer, the shared portion of the segment between all three people, the length of the blue segment, is significant.
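
Finding the shared portion of stacked segments is simple arithmetic: the latest start and the earliest end. Here is a minimal Python sketch. The start and end positions are invented for illustration and are not my real match data.

```python
def shared_region(segments):
    """Return the (start, end) region common to every segment, or None."""
    start = max(seg_start for seg_start, _ in segments)
    end = min(seg_end for _, seg_end in segments)
    return (start, end) if start < end else None


# Hypothetical start and end addresses on chromosome 3.
blue = (40, 60)
red = (30, 80)
teal = (35, 90)

print(shared_region([blue, red, teal]))
```

With these made-up numbers, the common region is exactly the blue segment, the shortest of the three.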

The fact that the stacked matches are in the same places on the chromosomes, directly above/below each other, DOES NOT mean the matches also match each other.

The only way to know whether these matches are both on one side of my tree is whether or not they match each other. Do they look the same or different? One face or two? We can’t tell from this view alone.

We need to evaluate!

Two Faces – Matching Can be Deceptive!

What do these matches mean? Let’s ask and answer a few questions.

  • Does a stacked match mean that one of these people match on my mother’s side and one on my father’s side?

They might, but stacked matches don’t MEAN that.

If one match is maternal, and one is paternal, they still appear at the same location on your chromosome browser because Mom and Dad each have a side of the street, meaning a chromosome that you inherited.

Remember in our example that even though they have the same street address, Dad has blue Cs and Mom has pink As living at that location. In other words, their faces look different. So unless Mom and Dad have the same DNA on that entire segment of addresses, 1-10, Mom and Dad won’t match each other.

Therefore, my maternal and paternal matches won’t match each other on that segment either, unless:

  1. They are related to me through both of my parents and on that specific location.
  2. My mother and father are related to each other and their DNA is the same on that segment.
  3. There is significant endogamy that causes my parents to share DNA segments from their more distant ancestors, even though they are not related in the past few generations.
  4. The segments are small (segments less than 7 cM are false matches roughly 50% of the time) and therefore the match is simply identical by chance. I wrote about that here. The chart showing valid cM match percentages is shown here, but to summarize, 7-8 cM are valid roughly 46% of the time, 8-9 cM roughly 66%, 9-10 cM roughly 91% and 10-11 cM roughly 95%, but 100% is not reached until about 20 cM, and I have seen a few exceptions above that, especially when imputation is involved.
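
If you keep your matches in a spreadsheet or script, the rough percentages in item 4 can be turned into a lookup. This Python sketch uses only the figures quoted above. The function name is hypothetical, and sizes between 11 and 20 cM return None because I haven't quoted a figure for them here.

```python
# (low cM, high cM, rough percent of matches that are valid)
VALID_PERCENT_BANDS = [
    (7.0, 8.0, 46),
    (8.0, 9.0, 66),
    (9.0, 10.0, 91),
    (10.0, 11.0, 95),
]


def rough_valid_percent(cm):
    """Rough chance that a segment of this size is a valid match."""
    if cm < 7.0:
        return 50  # roughly half of these are false matches
    for low, high, percent in VALID_PERCENT_BANDS:
        if low <= cm < high:
            return percent
    if cm >= 20.0:
        return 100  # approximate; a few exceptions exist
    return None  # 11 to 20 cM: no figure quoted above


for size in [5.0, 7.5, 9.2, 15.0, 25.0]:
    print(size, rough_valid_percent(size))
```

Treat these as rules of thumb for deciding which matches deserve your time first, not as guarantees.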

Chromosome inheritance match.png

In this inheritance example, we see that pink Match #1 is from Mom’s side and matches the DNA I inherited from pink Mom. Blue Match #2 is from Dad’s side and matches the DNA I inherited from blue Dad. But as you can see, Match #1 and Match #2 do not match each other.

Therefore, the address is only half the story (double-sided.)

What lives at the address is the other half. Mom and Dad have two separate faces!

Chromosome actual overlay

Looking at our example of what our DNA in parental order really looks like on chromosome 1, we see that the blue person actually matches on my maternal side with all As, and the teal person on the paternal side with all Cs.

  • Does a stacked match on the chromosome browser mean that two people match each other?

Sometimes it happens, but not necessarily, as shown in our example above. The blue and teal person would not match each other. Remember, addresses are only half the story (the street is double-sided), but the nucleotides that live at that address tell the real story. Think two different looking faces, Mom’s and Dad’s, peering out those windows.

If stacked matches match each other too – then they match me on the same parental side. If they don’t match each other, don’t be deceived just because they live at the same address. Remember – Mom’s and Dad’s two faces look different.

For example, if both the blue and teal person match me maternally, with all As, they would also match each other. The addresses match and the values that live at the address match too. They look exactly the same – so they both match me on either my maternal or paternal side – but it’s up to me to figure out which is which using genealogy.

Chromosome actual maternal.png

When my matches do match each other on this segment, plus match me of course, it’s called triangulation.

Triangulation – Think of 3

If my two matches match each other on this segment, in addition to me, it’s called triangulation which is genealogically significant, assuming:

  1. That the triangulated people are not closely related. Triangulation with two siblings, for example, isn’t terribly significant because the common ancestor is only their parents. Same situation with a child and a parent.
  2. The triangulated segments are not small. Triangulation, like matching, on small segments can happen by chance.
  3. Enough people triangulate on the same segment that descends from a common ancestor to confirm the validity of the common ancestor’s identity, also confirming that the match is identical by descent, not identical by chance.

Chromosome inheritance triangulation.png

The key to determining whether my two matches both match me on my maternal side (above) or paternal side is whether they also match each other.

If so, assuming all three of the conditions above are true, we triangulate.
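
The three conditions can be written down as a checklist. In this Python sketch, the 7 cM cutoff, the class name and the field names are my own assumptions for illustration. Vendors and tools each apply their own thresholds.

```python
from dataclasses import dataclass


@dataclass
class TriangulationCandidate:
    shared_cm: float            # size of the segment shared by everyone
    all_match_each_other: bool  # every match also matches every other match
    closely_related: bool       # e.g. the matches are siblings or parent and child


def concerns(candidate, min_cm=7.0):
    """List reasons to be cautious; an empty list means the group looks useful."""
    notes = []
    if not candidate.all_match_each_other:
        notes.append("matches do not all match each other")
    if candidate.shared_cm < min_cm:
        notes.append("segment is small and may be identical by chance")
    if candidate.closely_related:
        notes.append("matches are closely related to each other")
    return notes


good_group = TriangulationCandidate(18.5, True, False)
sibling_pair = TriangulationCandidate(18.5, True, True)

print(concerns(good_group))
print(concerns(sibling_pair))
```

Even when the list of concerns comes back empty, more triangulated matches through different children of the ancestor make the conclusion stronger.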

Next, let’s look at a three-person match on the same segment and how to determine if they triangulate.

Three Way Matching and Identifying Imposters

Chromosome 3 in our example is slightly different, because all three people match me on at least a portion of that segment, meaning at the same address. The red and teal segments line up directly under the blue segment – so the portion that I can potentially match identically to all 3 people is the length of the blue segment. It’s easy to get excited, but don’t get excited quite yet.

Chromosome 3 way match.png

Given that three people match me on the same street address/location, one of the following three situations must be true:

  • Situation 1 – All three people match each other in addition to me, on that same segment, which means that all three of them match me on either the maternal or paternal side. This confirms that we are related on the same side, but not how or which side.

Chromosome paternal.png

In order to determine which side, maternal or paternal, I need to look at their and my genealogy. The blue arrows in these examples mean that I’ve determined these matches to all be on my father’s side utilizing a combination of genealogy plus DNA matching. If your parent is alive, this part is easy. If not, you’ll need to utilize common matching and/or triangulation with known relatives.

  • Situation 2 – Of these three people, Cheryl, the blue bar on top, matches me but does not match the other two. Charlene and David, the red and teal, match each other, plus me, but not Cheryl.

Chromosome maternal paternal.png

This means that at least one of my sides, maternal or paternal, is represented, given that Charlene and David also match each other. Until I can look at the identity of who matches, or their genealogy, I can’t tell which person or people descend from which side.

In this case, I’ve determined that Cheryl, my first cousin, with the pink arrow matches me on Mom’s side and Charlene and David, with the blue arrows, match me on Dad’s side. So both my maternal and paternal sides are represented – my maternal side with the pink arrow as well as my father’s side with the blue arrows.

If Cheryl was a more distant match, I would need additional triangulated matches to family members to confirm her match as legitimate and not a false positive or identical by chance.

  • Situation 3 – Of the three people, all three match me at the same addresses, but none of the three people match each other. How is this even possible?

Chromosome identical by chance.png

This situation seems very counter-intuitive since I have only 2 chromosomes, one from Mom and one from Dad – 2 sides of the street. It is confusing until you realize that one match (Cheryl and me, pink arrow) would be maternal, one would be paternal (Charlene and me, blue arrow) and the third (David and me, red arrows) would have DNA that bounces back and forth between my maternal and paternal sides, meaning the match with David is identical by chance (IBC.)

This means the third person, David, would match me, but not the people that are actually maternal and paternal matches. Let’s take a look at how this works.

Chromosome maternal paternal IBC.png

The addresses are the same, but the values that live at the addresses are not in this third scenario.

Maternal pink Match #1 is Cheryl, paternal blue Match #2 is Charlene.

In this example, Match #3, David, matches me because he has pink and blue at the same addresses that Mom and Dad have pink and blue, but he doesn’t have all pink (Mom) nor all blue (Dad), so he does NOT match either Cheryl or Charlene. This means that he is not a valid genealogical match – but is instead what is known as a false positive – identical by chance, not by descent. In essence, a wily genetic imposter waiting to fool unwary genealogists!

In his case, David is literally “two-faced” with parts of both values that live in the maternal house and the paternal house at those addresses. He is a “two-faced imposter” because he has elements of both but isn’t either maternal or paternal.

This is the perfect example of why matching and triangulating to known and confirmed family members is critical.

All three people, Cheryl, Charlene and David match me (double sided chromosomes), but none of them match each other (two legitimate faces – one from each parent’s side plus one imposter that doesn’t match either the legitimate maternal or paternal relatives on that segment.)

Remember Three Things

  1. Double-Sided – Mom and Dad both have the same addresses on both sides of each chromosome street.
  2. Two Legitimate Faces – The DNA values, nucleotides, will have a unique pattern for both your Mom and Dad (unless they are endogamous or related) and therefore, there are two legitimate matching patterns on each chromosome – one for Mom and one for Dad. Two legitimate and different faces peering out of the houses on Mom’s side and Dad’s side of the street.
  3. Two-Faced Imposters – those identical by chance matches which zig-zag back and forth between Mom and Dad’s DNA at any given address (segment), don’t match confirmed maternal and paternal relatives on the same segment, and are confusing imposters.
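
For anyone who likes to see the logic spelled out, here is a minimal Python sketch of those three points. Everything in it is made up for illustration: the letter strings, the names `mom_side` and `dad_side`, and the `classify` function are my own assumptions, and each person is simplified to a single row of values. Real vendor matching compares pairs of unphased values and applies segment size thresholds, so treat this as a toy model, not as how any vendor actually works.

```python
# Toy model, hypothetical data: one letter per address on a short stretch
# of one chromosome. Real matching uses thousands of SNPs and cM thresholds.
mom_side = "ACGTACGT"  # the values I inherited from Mom
dad_side = "TGCATGCA"  # the values I inherited from Dad

people = {
    "Cheryl": "ACGTACGT",    # all of Mom's values
    "Charlene": "TGCATGCA",  # all of Dad's values
    "David": "AGGAAGGA",     # alternates Mom, Dad, Mom, Dad ...
}


def matches_me(values):
    """Unphased view: at every address the value equals one of my two."""
    return all(v in (m, d) for v, m, d in zip(values, mom_side, dad_side))


def classify(values):
    """Label a match as maternal, paternal, identical by chance, or no match."""
    if not matches_me(values):
        return "no match"
    if values == mom_side:
        return "maternal"
    if values == dad_side:
        return "paternal"
    return "identical by chance"


for name, values in people.items():
    print(name, "->", classify(values))
```

David passes the "matches me" test at every address, but he fails both the Mom test and the Dad test, which is exactly what makes him a two-faced imposter.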

Are you ready to hit your home run?

What’s Next?

Now that we understand how matching and triangulation works and why, let’s put this to work at the vendors. Join me for my article in a few days, Triangulation in Action at Family Tree DNA, MyHeritage, 23andMe and GedMatch.

We will step through how triangulation works at each vendor. You’ll have matches at each vendor that you don’t have elsewhere. If you haven’t transferred your DNA file yet, you still have time with the step-by-step instructions below:

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

DNAPainter Instructions and Resources

DNAPainter garden

DNAPainter is one of my favorite tools because DNAPainter, just as its name implies, facilitates users painting their matches’ segments on their various chromosomes. It’s genetic art and your ancestors provide the paint!

People use DNAPainter in different ways for various purposes. I utilize DNAPainter to paint matches with whom I’ve identified a common ancestor and therefore know the historical “identity” of the ancestors who contributed that segment.

Those colors in the graphic above are segments identified to different ancestors through DNA matching.

DNAPainter includes:

  • The ability to paint or map your chromosomes with your matching segments as well as your ethnicity segments
  • The ability to upload or create trees and mark individuals you’ve confirmed as your genetic ancestors
  • A number of tools including the Shared cM Tool to show ranges of relationships based on your match level and WATO (what are the odds) tool to statistically predict or estimate various positions in a family based on relationships to other known family members

A Repository

I’ve created this article as a quick-reference instructional repository for the articles I’ve written about DNAPainter. As I write more articles, I’ll add them here as well.

  • The Chromosome Sudoku article introduced DNAPainter and how to use the tool. This is a step-by-step guide for beginners.

DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts

  • Where do you find those matches to paint? At the vendors such as Family Tree DNA, MyHeritage, 23andMe and GedMatch, of course. The Mining Vendor Matches article explains how.

DNAPainter – Mining Vendor Matches to Paint Your Chromosomes

  • Touring the Chromosome Garden explains how to interpret the results of DNAPainter, and how automatic triangulation just “happens” as you paint. I also discuss ethnicity painting and how to handle questionable ancestors.

DNA Painter – Touring the Chromosome Garden

  • You can prove or disprove a half-sibling relationship using DNAPainter – for you and also for other people in your tree.

Proving or Disproving a Half Sibling Relationship Using DNAPainter

  • Not long after Dana Leeds introduced The Leeds Method of clustering matches into 4 groups representing your 4 grandparents, I adapted her method to DNAPainter.

DNAPainter: Painting the Leeds Method Matches

  • Ethnicity painting is a wonderful tool to help identify Native American or minority ancestry segments by utilizing your estimated ethnicity segments. Minority in this context means minority to you.

Native American and Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments

  • Creating a tree or uploading a GEDCOM file provides you with Ancestral Trees where you can indicate which people in your tree are genetically confirmed as your ancestors.

DNAPainter: Ancestral Trees

  • Of course, the key to DNA painting is to have as many matches and segments as possible identified to specific ancestors. In order to do that, you need to have your DNA working for you at as many vendors as possible that provide you with matching and a chromosome browser. Ancestry does not have a browser or provide specific paintable segment information, but the other major vendors do, and you can transfer Ancestry results elsewhere.

DNAPainter: Painting “Bucketed” Family Tree DNA Maternal and Paternal Family Finder Matches in One Fell Swoop

  • Family Tree DNA offers the wonderful feature of assigning your matches to either a maternal or paternal bucket if you connect 4th cousins or closer on your tree. Until now, there was no way to paint that information at DNAPainter en masse, only manually one at a time. DNAPainter’s new tool facilitates a mass painting of phased, parentally bucketed matches to the appropriate chromosome – meaning that triangulation groups are automatically formed!

Triangulation in Action at DNAPainter

  • DNAPainter provides the ability to triangulate “automatically” when you paint your segments, as long as you know from which side, maternal or paternal, the match originates. Looking at the common ancestors of your matches on a specific segment tracks that segment back in time to its origins. Painting matches from all vendors who provide segment information gives you one single repository for walking your DNA information back in time.

DNA Transfers

Some vendors don’t require you to test at their company and allow transfers into their systems from other vendors. Those vendors do charge a small fee to unlock their advanced features, but not as much as testing there.

Ancestry and 23andMe DO NOT allow transfers of DNA from other vendors INTO their systems, but they do allow you to download your raw DNA file to transfer TO other vendors.

Family Tree DNA, MyHeritage and GedMatch all 3 accept files uploaded FROM other vendors. Family Tree DNA and MyHeritage also allow you to download your raw data file to transfer TO other vendors.
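
The transfer rules above can be written down as a small lookup. This is only a sketch of what this article states; the names `ACCEPTS_UPLOADS`, `ALLOWS_DOWNLOAD` and `transfer_targets` are hypothetical, and vendor policies can change, so always check with the vendor.

```python
# Rules as described in this article only; vendor policies may change.
ACCEPTS_UPLOADS = {"Family Tree DNA", "MyHeritage", "GedMatch"}
ALLOWS_DOWNLOAD = {"Ancestry", "23andMe", "Family Tree DNA", "MyHeritage"}


def transfer_targets(tested_at):
    """Where a raw data file from `tested_at` can be uploaded."""
    if tested_at not in ALLOWS_DOWNLOAD:
        return []  # no raw file to transfer, as far as this article says
    # You would not transfer a file back to the vendor it came from.
    return sorted(ACCEPTS_UPLOADS - {tested_at})


for vendor in sorted(ALLOWS_DOWNLOAD):
    print(vendor, "->", ", ".join(transfer_targets(vendor)))
```

Someone who tested at Ancestry or 23andMe has three places to transfer to, while someone who tested at Family Tree DNA or MyHeritage has two.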

These articles provide step-by-step instructions for how to download your results from the various vendors and how to upload to that vendor, when possible.

Here are some suggestions about DNA testing and a transfer strategy:

Paint and have fun!!!

______________________________________________________________


Autosomal DNA Matching Confidence Spectrum

Are you confused about DNA matches and what they mean…different kinds of matches…from different vendors and combined results between vendors?  Do you feel like lions and tigers and bears…oh my?  You’re not alone.

As the vendors add more tools, I’ve noticed recently that along with those tools has come a significant amount of confusion surrounding matches and what they mean.  Add to this issue confusion about the terminology being used within the industry to describe various kinds of matches.  Combined, we now have a verbiage or terminology issue and we have confusion regarding the actual matches and what they mean.  So, as people talk, what they mean, what they are trying to communicate and what they do say can be interpreted quite widely.  Is it any wonder so many people are confused?

I reached out within the community to others who I know are working with autosomal results on a daily basis and often engaged in pioneering research to see how they are categorizing these results and how they are referring to them.

I want to thank Jim Bartlett, Blaine Bettinger, Tim Janzen and David Pike (in surname alphabetical order) for their input and discussion about these topics.  I hope that this article goes a long way towards sorting through the various kinds of matches and what they can and do mean to genetic genealogists – and what they are being called.  To be clear, the article is mine and I have quoted them specifically when applicable.

But first, let’s talk about goals.

Goals

One thing that has become apparent over the past few months is that your goals may well affect how you interpret data.  For example, if you are an adoptee, you’re going to be looking first at your closest matches and your largest segments.  Distant matches and small segments are irrelevant at least until you work with the big pieces.  The theory of low hanging fruit, of course.

If your goal is to verify and generally validate your existing genealogy, you may be perfectly happy with Ancestry’s Circles.  Ancestry Circles aren’t proof, as many people think, but if you’re looking for low hanging fruit and “probably” versus “positively,” Ancestry Circles may be the answer for you.

If you didn’t stop reading after the last sentence, then I’m guessing that “probably” isn’t your style.

If your goal is to prove each ancestor and/or map their segments to your DNA, you’re not going to be at all happy with Ancestry’s lack of segment data – so your confidence and happiness level is going to be greatly different than someone who is just looking to find themselves in circles with other descendants of the same ancestor and go merrily on their way.

If you have already connected the dots on most of your ancestry for the past 4 or 5 generations, and you’re working primarily with colonial ancestors and those born before 1700, you may be profoundly interested in small segment data, while someone else decides to eliminate that same data on their spreadsheet to eliminate clutter.  One person’s clutter is another’s goldmine.

While, technically, the different types of tests and matches carry a different technical confidence level, your personal confidence ranking will be influenced by your own goals and by some secondary factors like how many other people match on a particular segment.

Let’s start by talking about the different kinds of matching.  I’ve been working with my Crumley line, so I’ll be utilizing examples from that project.

Individual Matching, Group Matching and Triangulation

There is a difference between individual matching, group matching and triangulation.  In fact, there is a whole spectrum of matching to be considered.

Individual Matching

Individual matching is when someone matches you.

confidence individual match

That’s great, but one match out of context generally isn’t worth much.  There’s that word, generally, because if there is one thing that is almost always true, it’s that there is an exception to every rule and that exception often has to do with context.  For example, if you’re looking for parents and siblings, then one match is all you need.

If this match happens to be to my first cousin, that alone confirms several things for me, assuming there is not a secondary relationship.  First, it confirms my relationship with my parent and my parent’s descent from their parents, since I couldn’t be matching my first cousin (at first cousin level) if all of the lines between me and the cousin weren’t intact.

confidence cousins

However, if the match is to someone I don’t know, and it’s not a close relative, like the 2nd to 4th cousins shown in the match above, then it’s meaningless without additional information.  Most of your matches will be more distant.  Let’s face it, you have a lot more distant cousins than close cousins.  Many ancestors, especially before about 1900, were indeed, prolific, at least by today’s standards.

So, at this point, your match list looks like this:

confidence match list

Bridget looks pretty lonely.  Let’s see what we can do about that.

Matching Additional People

The first question is “do you share a common ancestor with that individual?”  If yes, then that is a really big hint – but it’s not proof of anything – unless they are a close relative match like we discussed above.

Why isn’t a single match enough for proof?

You could be related to this person through more than one ancestral line – and that happens far more than I initially thought.  I did an analysis some time back and discovered that about 15% of the time, I can confirm a secondary genealogical line that is not related to the first line in my tree.  There were another 7% that were probable – meaning that I can’t identify a second common ancestor with certainty, but the surname and location is the same and a connection is likely.  Another 8% were from endogamous lines, like Acadians, so I’m sure there are multiple lines involved.  And of those matches (minus the Acadians), about 10% look to have 3 genealogical lines, not just two.  The message here – never assume.

When you find one match and identify one common genealogical line, you can’t assume that is how you are genetically related on the segment in question.

Ideally, at this point, you will find a third person who shares the common ancestor and their DNA matches, or triangulates, between you and your original match to prove the connection.  But, circumstances are not always ideal.

What is Triangulation?

Triangulation on the continuum of confidence is the highest confidence level achievable, outside of close relative matching which is evident by itself without triangulation.

Triangulation is when you match two people who share a common ancestor and all three of you match each other on that same segment.  This means that segment descended to all three of you from that common ancestor.

This is what a match group would look like if Jerry matches both John and Bridget.

confidence example 1 match group

Example 1 – Match Group

The classic definition of triangulation is when three people, A, B and C all match each other on the same segment and share a known, identifiable common ancestor.  Above, we only have two.  We don’t know yet if John matches Bridget.

A matches B
A matches C
B matches C

This is what an exact triangulation group would look like between Jerry, John and Bridget.  Most triangulation matches aren’t exact, meaning the start and/or end segment might be different, but some are exact.

confidence example 2 triangulation group

Example 2 – Triangulation Group
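
The difference between Example 1 and Example 2 can be sketched in a few lines of Python. The segment positions below are invented, and the names `pairwise`, `overlap` and `triangulated_region` are mine, not any vendor's. The sketch also assumes just one segment per pair, and it only checks the DNA half of the definition. You still have to confirm the shared ancestor through genealogy.

```python
from itertools import combinations

# Hypothetical pairwise matches: (chromosome, start, end) in base pairs.
pairwise = {
    frozenset({"Jerry", "John"}): [(2, 10_000_000, 25_000_000)],
    frozenset({"Jerry", "Bridget"}): [(2, 12_000_000, 30_000_000)],
    frozenset({"John", "Bridget"}): [(2, 11_000_000, 24_000_000)],
}


def overlap(a, b):
    """Return the region shared by two segments, or None if there is none."""
    if a[0] != b[0]:
        return None  # different chromosomes
    start, end = max(a[1], b[1]), min(a[2], b[2])
    return (a[0], start, end) if start < end else None


def triangulated_region(people, matches):
    """Region on which every pair in `people` matches, else None."""
    pairs = [frozenset(p) for p in combinations(people, 2)]
    if any(p not in matches for p in pairs):
        return None  # a pair is unverified, so this is only a match group
    region = matches[pairs[0]][0]  # assumes one segment per pair
    for p in pairs[1:]:
        region = overlap(region, matches[p][0])
        if region is None:
            return None
    return region


trio = ["Jerry", "John", "Bridget"]
print(triangulated_region(trio, pairwise))

# Example 1: we don't know yet whether John matches Bridget.
unverified = {k: v for k, v in pairwise.items()
              if k != frozenset({"John", "Bridget"})}
print(triangulated_region(trio, unverified))
```

With all three pairs confirmed, the function returns the region all three share. Take away the John and Bridget comparison and it returns `None`, because all we have left is a match group.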

It’s not always possible to prove all three.  Sometimes you can see that Jerry matches Bridget and Jerry matches John, but you have no access to John or Bridget’s kits to verify that they also match each other.  If you are at Family Tree DNA, you can run the ICW (in common with) tool to see if John and Bridget do match each other – but that tool does not confirm that they match on the same segment.

If the individuals involved have uploaded their kits to GedMatch, you have the ability to triangulate because you can see the kit numbers of your matches and you can then run them against each other to verify that they do indeed match each other as well.  Not everyone uploads their kits to GedMatch, so you may wind up with a hybrid combination of triangulated groups (like example 2, above) and matching groups (like example 1, above) on your own personal spreadsheet.

Matching groups (that are not triangulated) are referred to by different names within the community.  Tim Janzen refers to them as clusters of cousins, Blaine as pseudo triangulation and I have called them triangulation groups in the past if any three within the group are proven to be triangulated. Be careful when you’re discussing this, because matching groups are often misstated as triangulated groups.  You’ll want to clarify.

Creating a Match List

Sometimes triangulation options aren’t available to us.  For example, at Family Tree DNA, we can see who matches us, and we can see if they match each other utilizing the ICW tool, but we can’t see specifically where they match each other.  This is considered a match group.  This type of matching is also where a great deal of confusion is introduced because these people do match each other, but they are NOT (yet) triangulated.

What we know is that all of these people are on YOUR match list, but we don’t know that they are on each other’s match lists.  They could be matching you on different sides of your DNA or, if smaller segments, they might be IBC (identical by chance.)

You can run the ICW (in common with) tool at Family Tree DNA for every match you have.  The ICW tool is a good way to see who matches both people in question.  Hopefully, some of your matches will have uploaded trees and you can peruse for common ancestors.

The ICW tool is the little crossed arrows and it shows you who you and that person also match in common.

confidence match list ftdna

You can run the ICW tool in conjunction with the ancestral surname in question, showing only individuals who you have matches in common with who have the Crumley surname (for example) in their ancestral surname list.  This is a huge timesaver and narrows your scope of search immediately.  By clicking on the ICW tool for Ms. Bridget, you see the list, below, of those who match both Ms. Bridget and the person whose account we are signed into.

confidence icw ftdna

Another way to find common matches to any individual is to search by either the current surname or ancestral surnames.  The ancestral surname search checks the surnames entered by other participants and shows them in the results box.

In the example above, all of these individuals have Crumley listed in their surnames.  You can see that I’ve sorted by ancestral surname – as Crumley is in that search box.

Now, your match list looks like this relative to the Crumley line.  Some people included trees and you can find your common ancestor on their tree, or through communications with them directly.  In other cases, there is no tree, but the common surname appears in the surname match list.  You may want to note those results on your match list as well.

confidence match list 2

Of course, the next step is to compare these individuals in a matrix to see who matches who and the chromosome browser to see where they match you, which we’ll discuss momentarily.

Group Matching

The next type of matching is when you have a group of people who match each other, but not necessarily on the same segment of DNA.  These matching groups are very important, especially when you know there is a shared ancestor involved – but they don’t indicate that the people share the same segment, nor that all (or any) of their shared segments are from this particular ancestor.  Triangulation is the only thing that accomplishes proof positive.

This ICW matrix shows some of the Crumley participants who have tested and who matches whom.

confidence icw grid

You can display this grid by matching total cM or by known relationship (assuming the individuals have entered this information) or by predicted relationship range.  The total cMs shared is more important for me in evaluating how closely this person might be related to the other individual.
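
If you keep your own notes, a grid like this is easy to build from each person's match list. Here is a minimal sketch with invented match lists; the names `match_lists`, `in_common_with` and `icw_matrix` are hypothetical and have nothing to do with how Family Tree DNA builds its matrix.

```python
# Hypothetical match lists: each tester's set of matches.
match_lists = {
    "Jerry": {"John", "Bridget", "Abija"},
    "John": {"Jerry", "Abija"},
    "Bridget": {"Jerry"},
    "Abija": {"Jerry", "John"},
}


def in_common_with(a, b):
    """People who appear on both a's and b's match lists."""
    return (match_lists[a] & match_lists[b]) - {a, b}


def icw_matrix(names):
    """Grid of who matches whom; True means the two testers match."""
    return {a: {b: b in match_lists[a] for b in names if b != a}
            for a in names}


grid = icw_matrix(sorted(match_lists))
for name, row in grid.items():
    print(name, "matches", [other for other, hit in row.items() if hit])
```

A `True` in the grid only says that two people match somewhere. It says nothing about which segment, which is why a grid like this is a match group tool and not triangulation.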

The Chromosome Browser

The chromosome browser at Family Tree DNA shows matches from the perspective of any one individual.  This means that the background display of the 22 Chromosomes (plus X) is the person all of the matches are comparing against. If you’re signed in to your account, then you are the black background chromosomes, and everyone is being compared against your DNA.  I’m only showing the first 6 chromosomes below.

confidence chromosome browser

You can see where up to 5 individuals match the person you’re comparing them to.  In this case, it looks like they may share a common segment on chromosome 2 among several descendants.  Of course, you’d need to check each of these individuals to ensure that they match each other on this same segment to confirm that indeed, it did come from a common ancestor.  That’s triangulation.

When you see a grouping of matches of individuals known to descend from a common ancestor on the same chromosome, it’s very likely that you have a match group (cluster of cousins, pseudo triangulation group) and they will all match each other on that same segment if you have the opportunity to triangulate them, but it’s not absolute.

For example, below we have a reconstructed chromosome 8 of James Crumley, the common ancestor of a large group of people shown based on matches.  In other words, each colored segment represents a match between two people.  I have a lot more confidence in the matches shown with the arrows than the single or less frequent matches.

confidence chromosome 8 match group

This pseudo triangulation is really very important, because it’s more than just a match, but it’s not yet triangulation.  The more people you have who match you on this segment and who have the same ancestor, the more likely it is that this segment will triangulate.  This is also where much of the confusion comes from, because matching groups of multiple descendants on the same segments almost always do triangulate, so they have often been called triangulation groups, even when they have not all been triangulated to each other.  Very occasionally, you will find a group of several people with a common ancestor who triangulate to each other on this common segment, except that one member of the group doesn’t triangulate to one other member, even though both triangulate to everyone else.

confidence triangulation issue

This situation has to be an error of some sort, because if all of these people match each other, including B, then B really must match D.  Our group discussed this, and Jim Bartlett pointed out that these problem matches are often near the vendor matching threshold (or your threshold if you’re using GedMatch) and if the threshold is lowered a bit, they continue to match.  They may also be a marginal match on the edge, so to speak or they may have a read error at a critical location in their kit.
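
Jim's point about thresholds is easy to see with numbers. The cM values and the 7 cM cutoff below are invented for illustration, and `missing_pairs` is a hypothetical helper, not a vendor or GedMatch function.

```python
from itertools import combinations

# Hypothetical shared cM on the segment in question, for each pair.
shared_cm = {
    frozenset({"A", "B"}): 12.4,
    frozenset({"A", "C"}): 11.9,
    frozenset({"A", "D"}): 10.2,
    frozenset({"B", "C"}): 12.0,
    frozenset({"B", "D"}): 6.8,  # just under an assumed 7 cM cutoff
    frozenset({"C", "D"}): 9.7,
}


def missing_pairs(people, threshold_cm):
    """Pairs that fail to match at the given threshold."""
    return [
        (a, b)
        for a, b in combinations(people, 2)
        if shared_cm[frozenset({a, b})] < threshold_cm
    ]


print(missing_pairs("ABCD", 7.0))  # B and D drop out
print(missing_pairs("ABCD", 6.0))  # everyone matches everyone
```

At the assumed 7 cM cutoff, B and D look like they don't match, but lower the threshold a bit and the group is complete.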

What “in common with” matching does is to increase your confidence that these are indeed ancestral matches, a cousin cluster, but it’s not yet triangulation.

Ancestry Matches

Ancestry has added another level of matching into the mix.  The difference is, of course, that you can’t see any segment data at all at Ancestry, so you don’t have anything other than the fact that you do match the other person and, if you have a shaky leaf hint, that you also share a common ancestor in your trees.

confidence ancestry matches

When three people match each other on any segment (meaning this does not imply a common segment match) and also share a common ancestor in a tree, they qualify to be a DNA Circle.  However, there are other criteria that are weighted, and not every group of 3 individuals who match and share an ancestor becomes a DNA Circle.  Many do, though, and many Circles have significantly more than three individuals.

confidence Phoebe Crumley circle

This DNA Circle is for Phebe Crumley, one of my Crumley ancestors.  In this grouping, I match one close family group of 5 people, and one individual, Alyssa, all of whom share Phebe Crumley in their trees.  As luck would have it, the family group has also tested at Family Tree DNA and has uploaded their results to GedMatch, but as it stands here at Ancestry, with DNA Circle data only…the only thing I can do is to add them to my match list.

confidence match list 3

In case you’re wondering, the reason I only added three of the 5 family members of the Abija group to my match list is because two are children of one of the members and their Crumley DNA is represented through their parent.

While a small DNA Circle like Phebe Crumley’s can be incorrect, because the individuals can indeed be sharing the DNA of a different ancestor, a larger group gives you more confidence that the relationship to that group of people is actually through the common ancestor whose circle you are a member of.  In the example Circle shown below, I match 6 individuals out of a total of 21 individuals who are all interrelated and share Henry Bolton in their tree.

Confidence Henry Bolton circle

New Ancestor Discoveries

Ancestry introduced New Ancestor Discoveries (NADs) a few months ago.  This tool is, unfortunately, misnamed – and although this is a good concept for finding people whose DNA you share, but whose tree you don’t – it’s not mature yet.

The name causes people to misinterpret the “ancestors” given to them as genuinely theirs.  So far, I’ve had a total of 11 NADs and most have been easily proven false.

Here’s how NADs work.  Let’s say there is a DNA Circle, John Doe, of 3 people and you match two of them.  The assumption is that John Doe is also your ancestor because you share the DNA of his descendants.  This is a critically flawed assumption.  For example, in one case, my ancestor’s sister’s husband is shown as my “new ancestor discovery” because I share DNA with his descendants (through his wife, my ancestor’s sister.)  Like I said, not mature yet.

I have discussed this repeatedly, so let’s just suffice it to say for this discussion, that there is absolutely no confidence in NADs and they aren’t relevant.

Shared Matches

Ancestry recently added a Shared Matches function.

For each person that you match at Ancestry, that is a 4th cousin or closer and who has a high confidence match ranking, you can click on shared matches to see who you and they both match in common.

confidence ancestry shared matches

This does NOT mean you match these people through the same ancestor.  This does NOT mean you match them on the same segment.  I wrote about how I’ve used this tool, but without additional data, like segment data, you can’t do much more with this.

What I have done is to build a grid similar to the Family Tree DNA matrix where I’ve attempted to see who matches whom and if there is someone(s) within that group that I can identify as specifically descending from the same ancestor.  This is, unfortunately, extremely high maintenance for a very low return.  I might add someone to my match list if they matched a group (or circle) of people that match me, whose common ancestor I can clearly identify.

Shared Matches are the lowest item on the confidence chart – which is not to say they are useless.  They can provide hints that you can follow up on with more precise tools.

Let’s move to the highest confidence tool, triangulation groups.

Triangulation Groups

Of course, the next step, either at 23andMe, Family Tree DNA, through GedMatch, or some combination of each, is to compare the actual segments of the individuals involved.  This means, especially at Ancestry where you have no tools, that you need to develop a successful begging technique to convince your matches to upload their data to GedMatch or Family Tree DNA, or both.  Most people don’t, but some will and that may be the someone you need.

You have three triangulation options:

  1. If you are working with the Family Inheritance Advanced at 23andMe, you can compare each of your matches with each other. I would still invite those matches to upload to GedMatch so you can compare them with people who did not test at 23andMe.
  2. If you are working with a group of people at Family Tree DNA, you can ask them to run themselves against each other to see if they also match on the same segment that they both match you on. If you are a project administrator on a project where they are all members, you can do this cross-check matching yourself. You can also ask them to upload their results to GedMatch.
  3. If your matches will upload their results to GedMatch, you can run each individual against any other individual to confirm their common segment matches with you and with each other.

In reality, you will likely wind up with a mixture of matches on your match list and not everyone will upload to GedMatch.

Confirming that segments create a three way match when you share a common ancestor constitutes proof that you share that common ancestor and that particular DNA has been passed down from that ancestor to you.

confidence match list 4

I’ve built this confidence table relative to matches first found at Family Tree DNA, adding matches from Ancestry and following them to GedMatch.  Fortunately, the Abija group has tested at all 3 companies and also uploaded their results to GedMatch.  Some of my favorite cousins!

Spectrum of Confidence

Blaine Bettinger built this slide that sums up the tools and where they fall on the confidence range alone, without considerations of your goals and technical factors such as segment size.  Thanks Blaine for allowing me to share it here.

confidence level Blaine

These tools and techniques fall onto a spectrum of confidence, which I’ve tried to put into perspective, below.

confidence level highest to lowest

I really debated how to best show these.  Unfortunately, there is almost always some level of judgment involved. In some cases, like triangulation at the 3 vendors, the highest level is equivalent, but in other cases, like the medium range, it really is a spectrum from lowest to highest within that grouping.

Now, let’s take a look at our matches that we’ve added to our match list in confidence order.

confidence match list 5

As you would expect, those who triangulated with each other using some chromosome browser and share a common ancestor are the highest confidence matches – those 5 with a red Y.  These are followed by matches who match me and each other but not on the same segment (or at least we don’t know that), so they don’t triangulate, at least not yet.

I didn’t include any low confidence matches in this table, but of the lowest ones that are included, the shaky leaf matches at Ancestry who won’t answer inquiries and the matches at FTDNA who do share a common surname but didn’t upload their information to be triangulated are the lowest confidence of the group.  However, even those lower confidence matches on this chart are medium, meaning at Ancestry they are in a Circle and at FTDNA, they do match and share a common surname.  At Family Tree DNA, they may eventually fall into a triangulation group of other descendants who triangulate.

Caveats

As always, there are some gotchas.  As someone said in something I read recently, “autosomal DNA is messy.”

Endogamy

Endogamous populations are just a mess.  The problem is that literally, everyone is related to everyone, because the founder population DNA has just been passed around and around for generations with little or no new DNA being introduced.

Therefore, people who descend from endogamous populations often appear to be much more closely related than they are in a genealogical timeframe.

Secondly, we have the issue pointed out by David Pike, and that is when you really don’t know where a particular segment came from, because the segment matches both the parents, or in some cases, multiple grandparents.  So, which grandparent did that actual segment that descended to the grandchild descend from?

For people who are from the same core population on both parents’ sides, close matches are often your only “sure thing” and beyond that, hopefully you have your parents (at least one parent) available to match against, because that’s the only way of even beginning to sort into family groups.  This is known as phasing against your parents and while it’s a great tool for everyone to use – it’s essential to people who descend from endogamous groups. Endogamy makes genetic genealogy difficult.

In other cases, where you do have endogamy in your line, but only in one of your lines, endogamy can actually help you, because you will immediately know based on who those people match in addition to you (preferably on the same segment) which group they descend from.  I can’t tell you how many rows I have on my spreadsheet that are labeled with the word “Acadian,” “Brethren” and “Mennonite.”  I note the common ancestor we can find, but in reality, who knows which upstream ancestor in the endogamous population the DNA originated with.

Now, the bad news is that Ancestry runs a routine that removes DNA that they feel is too matchy in your results, and most of my Acadian matches disappeared when Ancestry implemented their form of population based phasing.

Identical by Population

There is sometimes a fine line between a match that’s from an ancestor one generation further back than you can go, and a match from generations ago via DNA found at a comparatively high percentage in a particular population.  You can’t tell the difference.  All you know is that you can’t assign that segment to an ancestor, and you may know it does phase against a parent, so it’s valid, meaning not IBC or identical by chance.

Yes, identical by population segment matching is a distinct problem with endogamy, but it can also be problematic with people from the same region of the world but not members of endogamous populations.  Endogamy is a term for the timeframe we’re familiar with.  We don’t know what happened before we know what happened.

From time to time, you’ll begin to see something “odd” happen, where a group of segments that you already have triangulated to one ancestor will then begin to triangulate to a second ancestor.  I’m not talking about the normal two groups for every address – one from your Mom’s side and one from your Dad’s.  I’m talking, for example, about when my Mom’s DNA in a particular area begins to triangulate to one ancestral group from Germany and one from France.  These clearly aren’t the same ancestors, and we know that one particular “spot” or segment range that I received from her DNA can only come from one ancestor.  But these segment matches look to be breaking that rule.

I created the example below to illustrate this phenomenon.  Notice that the top and bottom 3 all match nicely to me and to each other and share a common ancestor, although not the same common ancestor for the two groups.  However, the range significantly overlaps.  And then there is the match to Mary Ann in the middle whose common ancestor to me is unknown.

confidence IBP example

Generally, we see these on smaller segment groups, and this is indicative that you may be seeing an identical by population group.  Many people lump these IBP (identical by population) groups in with IBC, identical by chance, but they aren’t.  The difference is that the DNA in an IBP group truly is coming from your ancestors – it’s just that two distinct groups of ancestors have the same DNA because at some point, they shared a common ancestor.  This is the issue that “academic phasing” (as opposed to parental phasing) is trying to address.  This is what Ancestry calls “pileup areas” and attempts to weed out of your results.  It’s difficult to determine where the legitimate mathematical line is relative to genealogically useful matches versus ones that aren’t.  And as far as I’m concerned, knowing that my match is “European” or “Native” or “African” even if I can’t go any further is still useful.

Think about this.  If every European has between 1 and 4% Neanderthal DNA from just a few Neanderthal individuals that lived more than 20,000 years ago in Europe – why wouldn’t we occasionally trip over some common DNA from long ago that found its way into two different family lines?

When I find these multiple groupings, which is actually relatively rare, I note them and just keep on matching and triangulating, although I don’t use these segments to draw any conclusions until a much larger triangulated segment match with an identified ancestor comes into play.  Confidence increases with larger segments.
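
One way to spot these candidate identical by population areas in a match spreadsheet is to flag triangulated groups that sit on the same chromosome and the same parental side, overlap in range, and yet are assigned to different ancestors.  The sketch below is only an illustration under stated assumptions: the group labels, positions, and tuple layout are all invented, and real segment boundaries are fuzzy enough that a practical version would want a minimum overlap size.

```python
from itertools import combinations

# (chromosome, parental side, start, end, assigned ancestor) - all hypothetical
groups = [
    (5, "maternal", 10_000_000, 18_000_000, "German ancestral couple"),
    (5, "maternal", 12_000_000, 20_000_000, "French ancestral couple"),
    (5, "paternal", 11_000_000, 19_000_000, "Paternal ancestor"),
    (7, "maternal", 40_000_000, 55_000_000, "German ancestral couple"),
]

def conflicting_groups(groups):
    """Pairs of groups on the same chromosome and parental side whose ranges
    overlap but which name different ancestors."""
    flagged = []
    for a, b in combinations(groups, 2):
        same_place = a[0] == b[0] and a[1] == b[1]
        overlap_start = max(a[2], b[2])
        overlap_end = min(a[3], b[3])
        if same_place and overlap_end > overlap_start and a[4] != b[4]:
            flagged.append((a[4], b[4], overlap_start, overlap_end))
    return flagged

for first, second, start, end in conflicting_groups(groups):
    print(f"Review: {first} vs {second} overlap at {start:,}-{end:,}")
```

Only the two maternal groups on chromosome 5 are flagged here.  The paternal group overlaps the same range but is the normal "other side" of that address, so it is left alone.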

This multiple grouping phenomenon is a hint of a story I don’t know – and may never know.  Just because I don’t quite know how to interpret it today doesn’t mean it isn’t valid.  In time, maybe its full story will be revealed.

ROH – Runs of Homozygosity

Autosomal DNA tests test somewhere over 500,000 locations, depending on the vendor you select.  At each of those locations, you find a value of either T, A, C or G, representing a specific nucleotide.  Sometimes, you find runs of the same nucleotide, so you will find an entire group of all T, for example.  If either of your parents has all Ts in the same location, then you will match anyone with any combination of T and anything else.

confidence homozygosity example

In the example above, you can see that you inherited T from both your Mom and Dad.  Endogamy maybe?

Sally, although she will technically show as a match, doesn’t really “match” you.  It’s just a fluke that her DNA matches your DNA by hopping back and forth between her Mom’s and Dad’s DNA.  This is not a match by descent, but by chance, or IBC (identical by chance.)  There is no way for you to know this, except by also comparing your results to Sally’s parents – another example of parental phasing.  You won’t match Sally’s parents on this segment, so the segment is IBC.

Now let’s look at Joe.  Joe matches you legitimately, but you can’t tell by just looking at this whether Joe matches you on your Mom’s or Dad’s side.  Unfortunately, because no one’s DNA comes with a zipper or two sides of the street labeled Mom and Dad – the only way to determine how Joe matches you is to either phase against Joe’s parents or see who else Joe matches that you match, preferably on the same segment – in other words – create either a match or ICW group, or triangulation.
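
To make the Sally situation concrete, here is a minimal sketch of half-identical matching on unphased genotypes.  The genotypes are invented for illustration and are not taken from the chart above, and real vendor matching also applies cM and SNP thresholds that this sketch ignores.

```python
# Hypothetical genotypes: each entry is the unordered pair of alleles read at
# one tested location. The test cannot say which allele came from which parent.

def half_identical(genotype_a, genotype_b):
    """True if the two genotypes share at least one allele."""
    return bool(set(genotype_a) & set(genotype_b))

def matches_whole_run(person_a, person_b):
    """True if two people are half identical at every location in the run."""
    return all(half_identical(a, b) for a, b in zip(person_a, person_b))

you = ["TT"] * 6  # a run of T inherited from both Mom and Dad

# Invented parents for Sally, chosen so that neither one carries a run of T.
sally_mom = ["TT", "AA", "TT", "AA", "TT", "AA"]
sally_dad = ["AA", "TT", "AA", "TT", "AA", "TT"]
# Sally receives one allele from each parent at every location.
sally = [mom[0] + dad[0] for mom, dad in zip(sally_mom, sally_dad)]

print("You vs Sally:      ", matches_whole_run(you, sally))
print("You vs Sally's Mom:", matches_whole_run(you, sally_mom))
print("You vs Sally's Dad:", matches_whole_run(you, sally_dad))
```

In this made-up case the comparison to Sally reports a match across the whole run, while the comparisons to each of her parents do not, which is the signature of a segment that is identical by chance.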

Segment Size

Everyone is in agreement about one thing.  Large segments are never IBC, identical by chance.  And I hate to use words like never, so today, interpret never to mean “not yet found.”  I’ve seen that large segment number be defined as both 13cM and 15cM, and “almost never” over 10cM.  There is currently discussion surrounding the X chromosome and false positives at about this threshold, but the jury is still out on this one.

Most medium segments hold true too.  Medium segment matches to multiple people with the same ancestors almost always hold true.  In fact, I don’t personally know of one that didn’t, but that isn’t to say it hasn’t happened.

By medium segments, most people say 7cM and above.  Some say 5cM and above with multiple matching individuals.

As the segment size decreases, the confidence level decreases too, but can be increased by either multiple matches on that segment from a common proven ancestor or, of course, triangulation.  Phasing against your parent also assures that the match is not IBC.  As you can see, there are tools and techniques to increase your confidence when dealing with small segments, and to eliminate IBC segments.

The issue of small segments, and how and when they can be utilized, is still unresolved.  Some people simply delete them.  I feel that is throwing the baby out with the bathwater.  Small segments that triangulate from a common ancestor, and that don’t find themselves in the middle of a pileup region that is identical by population or that is known to be overly matchy (near the center of chromosome 6, for example), can be utilized.  In some cases, these segments are proven because that same small segment section is also proven against matches that are much larger in a few descendants.

Tim Janzen says that he is more inclined to look at the number of SNPs instead of the segment size, and his comfort number is 500 SNPs or above.

The flip side of this is, as David Pike mentioned, that the fewer locations you have in a row, the greater the chance that you can randomly match, or that you can have runs of heterozygosity.

No one in our discussion group felt that all small segments were useless, although the jury is still out in terms of consensus about what exactly defines a small segment and when they are legitimate and/or useful.  Every one of us wants to work towards answers, because for those of us who are dealing with colonial ancestors and have already picked the available low hanging fruit, those tantalizing small segments may be all that is left of the ancestor we so desperately need to identify.

For example, I put together this chart detailing my matching DNA by generation. Interesting, I did a similar chart originally almost exactly three years ago and although it has seemed slow day by day, I made a lot of progress when a couple of brick walls fell, in particular, my Dutch wall thanks to Yvette Hoitink.

If you look at the green group of numbers, that is the amount of shared DNA to be expected at each level.  The number of shared cMs drops dramatically between the 5th and 6th generation, from 13 cM at the 5th generation, which would be considered a reasonable matching level (according to the above discussion), to 3.32 cM at the 6th generation level, which is a small segment by anyone’s definition.

confidence segment size vs generation

The 6th generation was born roughly in 1760, and if you look to the white grouping to the right of the green group, you can see that my percentage of known ancestors is 84% in the 5th generation, 80% in the 6th generation, but drops quickly after that to 39, 22 and 3%, respectively.  So, the exact place where I need the most help is also the exact place where the expected amount of DNA drops from 13 to 3.32 cM.  This means, that if anyone ever wants to solve those genealogical puzzles in that timeframe utilizing genetic genealogy, we had better figure out how to utilize those small segments effectively – because it may well be all we have except for the occasional larger sticky segment that is passed intact from an ancestor many generations past.

From my perspective, it’s a crying shame that Ancestry gives us no segment data and it’s sad that 23andMe only gives us 5cM and above.  It’s a blessing that we can select our own threshold at GedMatch.  I’m extremely grateful that FTDNA shows us the small segment matches to 1cM and 500 SNPs if we also match on 20cM total and at least one segment over 7cM.  That’s a good compromise, because small segments are more likely to be legitimate if we have a legitimate match on a larger segment and a known ancestor.  We already discussed that the larger the matching segment, the more likely it is to be valid. I would like to see Family Tree DNA lower the matching threshold within projects.  Surname projects imply that a group of people will be expected to match, so I’d really like to be able to see those lower threshold matches.
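
The thresholds described above can be paraphrased as a simple filter.  This is only a sketch of the rule as stated in the previous paragraph, not Family Tree DNA’s actual algorithm: the function name, the tuple layout, the sample segments, and the choice to treat “over 7cM” as 7cM or more are all assumptions.

```python
def visible_segments(segments, min_total_cm=20.0, min_longest_cm=7.0,
                     min_segment_cm=1.0, min_snps=500):
    """Return the segments that would be shown for one pair of testers.

    segments is a list of (cM, SNP count) tuples. The pair must first qualify
    as a match overall; only then are the small segments reported.
    """
    total_cm = sum(cm for cm, _ in segments)
    longest_cm = max((cm for cm, _ in segments), default=0.0)
    if total_cm < min_total_cm or longest_cm < min_longest_cm:
        return []  # not reported as a match, so no segments are shown
    return [(cm, snps) for cm, snps in segments
            if cm >= min_segment_cm and snps >= min_snps]

# Hypothetical pair with one solid segment and several small ones.
cousin = [(12.4, 2900), (5.1, 800), (2.3, 600), (1.8, 450), (0.9, 700)]
# Hypothetical pair with enough total cM but no segment of 7cM or more.
stranger = [(6.5, 1500), (6.0, 1400), (5.5, 1300), (3.0, 700)]

print(visible_segments(cousin))
print(visible_segments(stranger))
```

The design point is the one made above: the small segments are only surfaced when a larger segment has already established the match.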

I’m hopeful that Family Tree DNA will continue to provide small segment information to us.  People who don’t want to learn how to use or be bothered with small segments don’t have to.  Delete is a perfectly legitimate option, but without the data, those of us who are interested in researching how to best utilize these segments can’t.  And when we don’t have data to use, we all lose.  So, thank you Family Tree DNA.

Coming Full Circle

This discussion brings us full circle once again to goals.

Goals change over time.

My initial reason for testing, the first day an autosomal test could be ordered, was to see if my half-brother was my half-brother.  Obviously for that, I didn’t need matching to other people or triangulation.  The answer was either yes or no, we do match at the half-sibling level, or we don’t.

He wasn’t.  But by then, he was terminally ill, and I never told him.  It certainly explained why I wasn’t a transplant match for him.

My next goal, almost immediately, was to determine which of us, if either, was the child of my father.  For that, we did need matching to other people, and preferably close cousins – the closer the better.  Autosomal DNA testing was new at that time, and I had to recruit cousins.  Bless those who took pity on me and tested, because I was truly desperate to know.

Suffice it to say that the wait was a roller coaster ride of emotion.

If I was not my father’s child, I had just done 30+ years of someone else’s genealogy – not a revelation I relished, at all.

I was my father’s child.  My brother wasn’t.  I was glad I never told him the first part, because I didn’t have to tell him this part either.

My goal at that point changed to more of a general interest nature as more cousins tested and we matched, verifying different lineages that had been unable to be verified by Y or mtDNA testing.

Then one day, something magical happened.

One of my Y lines, Marcus Younger, whose Y line is the result of an NPE (a nonparental event or, said differently, an undocumented adoption), received amazing information.  The paternal Younger family line we believed Marcus descended from, he didn’t.  However, autosomal DNA confirmed that even though he is not the paternal child of that line, he is still autosomally related to that line, sharing a common ancestor – suggesting that he may have been born of a Younger female and given that surname, while carrying the Y DNA of his biological father, who remains unidentified.

Amazingly, the next day, a match popped up that matched me and another Younger relative.  This match descended not from the Younger line, but from Marcus Younger’s wife’s alleged surname family.  I suddenly realized that not only was autosomal DNA interesting for confirming your tree – it could also be used to break down long-standing brick walls.  That’s where I’ve been focused ever since.

That’s a very different goal from where I began, and my current goal utilizes the tools in a very different way than my earlier goals.  Confidence levels matter now, a great deal, where that first day, all I wanted was a yes or no.

Today, my goal, other than breaking down brick walls, is for genetic genealogy to become automated and much easier but without taking away our options or keeping us so “safe” that we have no tools (Ancestry).

The process that will allow us to refine genetic genealogy and group individuals and matches utilizing trees on our desktops will ultimately be the key to unraveling those distant connections.  The data is there, we just have to learn how to use it most effectively, and the key, other than software, is collaboration with many cousins.

Aside from science and technology, the other wonderful aspect of autosomal DNA testing is that it has the potential to unite and often, reunite families who didn’t even know they were families.  I’ve seen this over and over now and I still marvel at this miracle given to us by our ancestors – their DNA.

So, regardless of where you fall on the goals and matching confidence spectrum in terms of genetic genealogy, keep encouraging others to test and keep reaching out and sharing – because it takes a village to recreate an ancestor!  No one can do it alone, and the more people who test and share, the better all of our chances become to achieve whatever genetic genealogy goals we have.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

4 Generation Inheritance Study

I’ve recently had the opportunity to perform two, 4-generation, inheritance studies.

In both of these cases, we have the DNA of 4 generations: grandmother, parent, child and grandchild or grandchildren.  I’ll be using the second study because there are two great-grandchildren to compare.

Let me introduce you to the players.

4 gen pedigree

I wanted, with real data, to address some assertions and assumptions that I see being made periodically in the genetic genealogy community.  We need to know if these hold up to scrutiny, or not.  Besides that, it’s just fun to see what happens to DNA with 4 generations and 5 people to compare.

What kinds of information are we looking to confirm or refute in this study?

1 – That small segments don’t occur within a couple generations, meaning that DNA can’t be or isn’t broken into small segments that quickly.

2 – That small segments can never be used genealogically and are not useful.

3 – That DNA is most of the time passed in 50% packages.  While this is true in the first generation, meaning a child does receive half of each parent’s DNA, they do not receive 25% of each grandparent’s DNA.

4 – That segments over a certain threshold, like 5 or 7 cM, are all reliable as IBD (identical by descent.)

5 – That segments under a certain threshold, like 5 or 7 cM are all unreliable and should never be used, in fact, cannot ever be used and should be discarded.

6 – That there is a rule that you cannot have more than two crossovers per chromosome.

All individuals tested at Family Tree DNA and we’ll be using the FTDNA chromosome browser for comparisons.

First, let’s look at the amount of expected DNA matching versus the actual amount of DNA matching, per generation.  The entire number of cM being measured is 6766.2, per the ISOGG Autosomal Statistics Wiki page.

Expected vs Actual Inheritance Chart

This chart compares the expected versus actual amount of DNA shared between person 1 and person 2.

Person 1 | Person 2 | Expected DNA Match (cM / %) | Actual DNA Match (cM / %)
Grandmother | Parent (grandmother’s child) | 3383.1 / 50% | 3384.03 / 50.01%
Grandmother | Pink Child (grandmother’s grandchild) | 1691.5 / 25% | 1670.64 / 24.69%
Grandmother | Blue Grandchild (grandmother’s great-grandchild) | 845.775 / 12.5% | 704.84 / 10.39%
Grandmother | Green Grandchild (grandmother’s great-grandchild) | 845.775 / 12.5% | 842.64 / 12.45%
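
The expected column is simply the total halved once per generation.  As a minimal sketch, assuming the 6766.2 cM total quoted above and the actual values from the chart, the arithmetic looks like this (the function and variable names are mine, for illustration only):

```python
TOTAL_CM = 6766.2  # total cM quoted above from the ISOGG wiki

def expected_cm(generations):
    """Expected shared cM after the given number of parent-to-child steps."""
    return TOTAL_CM * 0.5 ** generations

# (person compared to the oldest generation, steps down the tree, actual cM)
rows = [
    ("Parent", 1, 3384.03),
    ("Pink child", 2, 1670.64),
    ("Blue great-grandchild", 3, 704.84),
    ("Green great-grandchild", 3, 842.64),
]

for label, generations, actual in rows:
    expected = expected_cm(generations)
    print(f"{label:24s} expected {expected:9.3f} cM, "
          f"actual {actual:8.2f} cM, difference {actual - expected:+8.2f} cM")
```

The differences make the point of the chart easy to see: the two full siblings in the youngest generation sit on opposite sides of the same expected value, one far below it and one almost on it.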

Chromosome Data

Now, let’s take a look at our chromosome data.  Keep in mind, everyone is being compared to the oldest generation – in this case – the great-grandmother’s DNA.

Legend

  • The background chromosome belongs to the great-grandmother of the youngest generation – meaning everyone is being compared to her.
  • Grandparent = orange – because the child receives 50% of each parent’s DNA, the orange child of the great-grandmother will match her DNA 100%.
  • Grandchild = pink – since the grandchild is being compared to the grandparent, and not their parent, we will see how much of the grandmother’s DNA the pink child received. The dark spaces are the “ghost image” of the grandfather’s DNA – identified by the lack of the grandmother’s DNA in that location.
  • Oldest great grandchild = blue
  • Youngest great grandchild = green

The two great grandchildren are full siblings.  None of the parents involved are related to each other or to other generational spouses.  This has been confirmed both by genealogy pedigree chart and by utilizing the tools at GedMatch for comparisons to each other as well as the “are your parents related” tool.

The first comparison, below, shows the 4 individuals compared to the great grandmother’s DNA at Family Tree DNA with the match threshold at the default of 5cM.

4 gen ftdna default

The image below shows the same individuals after dropping the match criteria to 1cM.  Several small colored segments appear.

4 gen ftdna 1 cm

I downloaded all of the matching data for these individuals into a spreadsheet so that I could work with the actual chromosomal data.  I’m not boring you with that here, but I have used the raw matching data for the actual comparisons.

Crossover

Let’s talk about what a crossover is, because understanding crossovers is important.

Crossover example 1 – A crossover is where you start/stop receiving DNA from one grandparent or the other.  This is easy to see if we look at chromosome 1.

4 gen crossover

In this example, the parent is orange and the child is pink but they are both being compared to the grandparent of the pink person, the mother of the orange person.

What this means is that while the orange person will always match the grey background chromosome of their mother, the pink person will only match their grandmother on the portion of the DNA they received from their mother that was from their grandmother.  The pink person received their grandfather’s DNA in some locations, and not their grandmother’s.  Where that transition happens is called a crossover and it is where the colored segment stops, as noted by the arrows above, and the black background begins, indicating no match to the grandmother.

You can see that the matches span the center of the chromosome where the grey area indicates there is no data being read.  There is also a second small grey area to the right of the center.  Ignore these grey areas.  They are in essence DNA deserts where there isn’t enough DNA to be read or useful.  Family Tree DNA (and other vendors) stitch the data on both sides together, so to speak, and matches on both sides of this area are considered to be contiguous matches.

You can see that the pink person has two crossover areas where they stopped receiving DNA from the mother’s mother (background chromosome being compared against) and instead started receiving DNA from the mother’s father.  How do we know that?  There are only two people who contributed the orange parent’s DNA that the pink child inherited.  If the pink child did not inherit the orange parent’s Mom’s DNA on this segment, then the pink child had to have inherited the orange parent’s Dad’s DNA.

Crossover example 2 – A second kind of crossover is where you are still receiving DNA from the same parent, but from different ancestors on that parental line.

I’ve created a chart to illustrate this phenomenon.

The names in the charts at the bottom are the people who tested today.  All of these individuals are known cousins who are from my mother’s side.  The name at the top is the common ancestor of all of the testers.

In the first situation, in locations 1-5, Me, Charlie and David match.  None of the three of us match our cousin, Mary on those locations.  However, moving to locations 6-10, Me, Charlie and Mary match each other, but not David.  Looking at our pedigree charts, we can see that the cousins are matching on different ancestral lines.

4 gen generational crossover

Me, Charlie and David share a wife’s line, Sally (wife of John), that Mary does not share.  Me, Charlie and Mary share common DNA from George, a male further upstream in that line.  George’s son John married Sally.  Mary descends from George through a different child, which is why she does not match any of us on the segments we received from Sally, John’s wife.

Location | Me | Charlie | David | Mary
1 | Sally | Sally | Sally | No match
2 | Sally | Sally | Sally | No match
3 | Sally | Sally | Sally | No match
4 | Sally | Sally | Sally | No match
5 | Sally | Sally | Sally | No match
6 | George | George | No match | George
7 | George | George | No match | George
8 | George | George | No match | George
9 | George | George | No match | George
10 | George | George | No match | George

If you’re just looking at the question, “do Charlie and I match?” the answer would of course be yes, but until we look at a broader spectrum of cousins, we won’t know that our match is actually from two different people in the same descendancy line and that we have an ancestor crossover between locations 5 and 6.  However, we’re still receiving our DNA from the same parent, but which ancestor of that parent contributed the DNA has switched.
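
The same reasoning can be written down as a small lookup.  This sketch simply encodes the table above and reports where the contributing ancestor changes within one person’s row; the dictionary layout and function name are assumptions made for illustration.

```python
# Which ancestor each tester's matching DNA is attributed to at locations
# 1 through 10, copied from the table above (None means no match).
table = {
    "Me":      ["Sally"] * 5 + ["George"] * 5,
    "Charlie": ["Sally"] * 5 + ["George"] * 5,
    "David":   ["Sally"] * 5 + [None] * 5,
    "Mary":    [None] * 5 + ["George"] * 5,
}

def ancestor_crossovers(assignments):
    """Return (location before, location after, old ancestor, new ancestor)
    for every place the contributing ancestor switches. Locations are 1-based.
    Going from a match to no match is the end of a match, not a switch."""
    changes = []
    for i in range(1, len(assignments)):
        previous, current = assignments[i - 1], assignments[i]
        if previous is not None and current is not None and previous != current:
            changes.append((i, i + 1, previous, current))
    return changes

for person, assignments in table.items():
    print(person, ancestor_crossovers(assignments))
```

Only Me and Charlie show the switch between locations 5 and 6.  David and Mary each see just one side of it, which is why it takes the broader group of cousins to reveal the ancestor crossover.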

How prevalent are crossovers?

Number of Crossover Events

These are all parent/child crossovers where the DNA donor switched.  We can only determine that this happened because we can compare generationally against the grey background great grandmother to the youngest generation.

  • Orange parent to Pink child – 49
  • Pink child to Blue child – 47
  • Pink child to Green child – 39

The most segmented chromosome, chromosome 1, has 5 separate matching segments for the blue great grandchild (as compared to the great-grandmother), or 10 crossover events (because neither end was at the beginning or end, although start and end numbers are sometimes “fuzzy”).  You can see where a crossover event occurs when the DNA goes from matching to non-matching.
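
Counting crossovers from a list of matching segments works the same way on any chromosome: every segment edge that does not sit at the end of the chromosome is one crossover.  The sketch below assumes hypothetical positions (not the real chromosome 1 data), segments separated by gaps, and exact boundaries, which the “fuzzy” start and end numbers mentioned above do not always allow.

```python
def count_crossovers(segments, chrom_start, chrom_end):
    """Count crossover events implied by matching segments on one chromosome.

    segments is a list of (start, end) ranges that match the oldest
    generation, assumed not to touch or overlap one another.
    """
    crossovers = 0
    for start, end in segments:
        if start > chrom_start:
            crossovers += 1  # switched into the matching ancestor's DNA
        if end < chrom_end:
            crossovers += 1  # switched back out of it
    return crossovers

# Hypothetical chromosome running from 0 to 249 with five interior segments.
blue_segments = [(5, 30), (45, 60), (90, 120), (150, 170), (200, 230)]
print(count_crossovers(blue_segments, 0, 249))

# A match across the entire chromosome implies no crossover at all.
print(count_crossovers([(0, 249)], 0, 249))
```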

4 gen chr 1 crossovers

Results

I downloaded all of our matching data into a spreadsheet so that I can work with the segment matches individually.

Looking at the data, there are a few things that jump out immediately:

  • On chromosomes 4 and 14, the pink child received none of the orange grandmother’s DNA. That means that the pink child had to have received the grandfather’s DNA for all of both of those chromosomes. So, if anyone thinks that the 50% rule really works uniformly across generations – here’s concrete proof that it doesn’t. Furthermore, this occurred for an entire chromosome – twice out of 23 chromosomes, or 8.7% of the time.
  • On chromosome 11, the exact opposite happened. The pink child received all of the grandmother’s chromosome, but barely gave any to their blue child. The blue child received their mother’s DNA in that location. On chromosome 13, the pink child received almost all of the grandmother’s DNA.
  • Please note that while the averages of expected versus inherited DNA work out pretty closely, when averaging across all 23 chromosomes, as shown in the Expected vs Actual Inheritance Chart, the individual chromosomes and how much of which grandparent’s or great-grandparent’s DNA is inherited varies wildly from none to 100%.
  • There are several locations on 10 different chromosomes where the DNA has been passed generationally intact 2 or 3 times, without division.
  • Several small segments have been created within 3 transmission events. There are small green and blue segments on several different chromosomes which reflect very small amounts of the great grandmother’s DNA inherited by the green and blue great-grandchildren. This conclusively dismisses the theory that small segments aren’t ever created within a couple of generations.
  • Chromosome 10 is very choppy, including small blue and green grandchild segments that match the orange grandparent and the great-grandmother without having matches to the pink child. This means that those unconnected blue and green small segments are either identical by chance or there is a read issue with the pink person’s DNA on this chromosome.
  • There are a total of 31 small segments, meaning under 7cM. Of those, a total of 10 do not triangulate, meaning they match the grandmother but they do not match their parent.  The 7 pink segments appear to triangulate, but without another generation of transmission (like the blue and green great-grandchildren), or without the grandfather’s DNA, or without triangulation with a known relative on that segment, it’s impossible to tell for sure. Therefore, 14, or 45% are valid segments and do triangulate.
  • There are a total of 92 chromosomal transmission events that took place, meaning that 23 chromosomes got passed from the background person to their orange child, 23 from the orange child to their pink child, 23 from the pink child to the blue grandchild and 23 from the pink child to the green grandchild.
  • Furthermore, based on this limited study, at least 32.26% of the small segments do not triangulate and are not IBD, but are instead identical by chance.
  • In three instances, the exact DNA (from the great grandmother) was given to both the green and blue great grandchildren. In eight other events, the same DNA, without division, was given from a parent to one child.
  • There are several instances, on chromosomes 3, 4, 9, 14, 15, 16, 20, and 22 where the pink child passed none of their grandmother’s DNA to their child, even though they inherited the grandmother’s DNA.
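
The small segment percentages in the list above follow directly from the counts given there.  As a quick check of the arithmetic, using only the numbers stated in the bullets (the variable names are mine):

```python
total_small = 31       # segments under 7cM
not_triangulated = 10  # match the grandmother but not their parent
undetermined = 7       # pink segments that cannot be confirmed either way
triangulated = total_small - not_triangulated - undetermined

print(f"Triangulate:        {triangulated} ({triangulated / total_small:.2%})")
print(f"Do not triangulate: {not_triangulated} ({not_triangulated / total_small:.2%})")
print(f"Undetermined:       {undetermined} ({undetermined / total_small:.2%})")

# Four sets of 23 chromosomes were passed down in this study.
transmission_events = 4 * 23
print(f"Chromosomal transmission events: {transmission_events}")
```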

Individual Chromosomes and Their Messages

I’d like to walk through several chromosomes and chat a little bit about what we’re seeing.

Chromosome 1

4 gen chr 1

First, I’d like to illustrate the difference between chromosome matches at the default level (the first chromosome, above) and at the 1cM level (the lower chromosome.)  At the lower match threshold, you will see additional small segment matches that are not shown at the higher threshold, noted by red arrows.

Let’s take a look at the messages held by our individual chromosomes.

On all of these chromosomes, you’ll see that the orange child matches their mother, the background person being compared against, exactly, on every location that is measured.  Half of everyone’s DNA comes from their mother, so all of their DNA will match to her on any given chromosome.  Remember, we are only measuring matching DNA (half identical segments) – so the other half of the person’s DNA that matches their father is not shown.

I have left the orange segments in the graphics, even though they all match on the entire chromosome length, so you can see the continuity from generation to generation.  Pink is the orange person’s child, so you can see that the pink child inherited part of the DNA the orange person inherited from their mother, but not all.  The part that is black in the pink row, as compared to the orange segment, means that the pink child inherited that DNA from their grandfather at those locations – and not the grandmother being compared against.

In one instance, on chromosome 1, the pink child gave their grandmother’s DNA to both of their children.  You can see that to the far left with the red arrow.

4 gen chr 1 grandmother transmission

You can also see that the blue grandchild only received a small part of their great grandmother’s DNA, but the green grandchild received a much larger segment.

In one area, the pink child clearly received their grandmother’s DNA, but didn’t give any of it to either the blue or green grandchild, shown below at the red arrow.  There is no blue or green matching the great-grandmother’s DNA.

4 gen chr 1 no transmission

To the right of the arrow, top, above, you can see where the pink child contributed their grandmother’s DNA to their blue child, but not to the green child.  The pink child contributed their other parent’s DNA in that instance, bottom, above, because their child does not match their orange mother – so that DNA had to come from the grandfather.

On the chromosome match that includes the smaller segments, below, you can see there are a total of 5 segments not shown with the higher threshold.

4 gen chr 1 small segments

The first two arrows, on the left, point to small segments shared by the blue and green grandchildren with their great-grandmother and their pink parent – so these triangulate and they are fine.

The third arrow, on the right hand side, points to a green segment that does not match the pink parent, which indicates a match that is identical by chance.  We’ll talk more about this in chromosome 3.

The fourth arrow, at the far right, shows a small segment of orange DNA that was passed to their pink child, but the pink child did not pass it on to either of their children.  This segment could be a legitimate segment by descent, but it could also be by chance.  We’ll talk about that more on chromosome 8.

Chromosome 2

4 gen chr 2

Chromosome 2 shows two small segments.  You can see that the pink child gave a significant portion of their grandmother’s DNA to the blue child, but only two small segments to the green child in that region, at the red arrows.  They do triangulate though, because they match their parents.  See how nicely the DNA stacks up between all of the generations.

Chromosome 3

4 gen chr 3

The pink child inherited very little of the grandmother’s DNA in this region.  Of the small amount the pink child did inherit, the pink child gave even less of it to their children.  One small piece to the green grandchild, shown at right, and none to the blue grandchild.

Why, then, is there a lonely blue segment on this comparison chromosome showing that the blue great-grandchild matches their orange grandmother and their great-grandmother, but not their pink parent?  This is the first example of an identical by chance segment (or a read error in the pink parent’s file).

4 gen chr 3 small seg

Three Kinds of DNA Match Segments

There are three kinds of DNA segment matches.

  1. Identical by descent (IBD) where you receive the segment from your ancestors and we can track it as far back up the tree as we have living people. This is the example where the small segment of the great-grandchildren (blue or green) match their parent (pink), their grandparent (orange) and their great-grandmother’s background chromosome being compared against.
  2. Identical by state (IBS) which sometimes is used to mean not identical by descent. What it actually means is that you can still match and receive the DNA from your ancestors, but the segment may be very prevalent in a specific community or ethnic group. An alternative explanation is that the DNA ‘state’ is so common that everyone in that area has it, so it’s virtually useless in identifying ancestors, because you can’t really tell which lines it came from. So IBS does triangulate, because it did come from a common ancestor, but you may match a large number of people at this location. Portions of chromosome 6 are known to fall into this category.  More often than not, I hear IBS used to indicate that there is a match, but the common ancestor isn’t known or hasn’t yet been identified.
  3. Identical by chance (IBC) is where a specific DNA combination is a match, but it’s not a match because it was handed down ancestrally, but simply by the luck of the draw.  Because everyone carries the DNA of both parents, sometimes people can match you by zigzagging back and forth between your father’s and mother’s DNA.  These matches aren’t ancestral, but just by luck or chance.  Shorter matches, meaning small segments, are much more likely to be identical by chance than longer matches. When you have both parents’ DNA, you can easily eliminate IBC segments because they won’t triangulate – as we have just demonstrated on chromosome 3.

You can read more about this here and here.
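
The parent check used on chromosome 3 can be sketched in a few lines of Python. This is only an illustration and not any vendor's algorithm: the function name, the tuple layout and the sample positions are all made up, and a small tolerance stands in for the fuzzy segment boundaries.

```python
def covered_by_parent(child_seg, parent_segs, tolerance=0):
    """Return True if the child's match to the ancestor lies inside one of
    the parent's matches to that same ancestor on the same chromosome.

    Segments are (start, end) positions. A segment that passes is
    consistent with IBD; one that fails is IBC or a file read problem.
    """
    c_start, c_end = child_seg
    for p_start, p_end in parent_segs:
        if p_start - tolerance <= c_start and c_end <= p_end + tolerance:
            return True
    return False


# Hypothetical positions on a single chromosome
pink_parent = [(10_000_000, 45_000_000)]   # pink vs. great-grandmother
blue_stacked = (12_000_000, 20_000_000)    # sits under the pink segment
blue_lonely = (60_000_000, 61_500_000)     # no pink segment above it

print(covered_by_parent(blue_stacked, pink_parent))  # True
print(covered_by_parent(blue_lonely, pink_parent))   # False
```

The tolerance argument is there because a child's segment can appear to run slightly past the parent's at a fuzzy boundary without being a separate match.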

Chromosome 4

4 gen chr 4

Chromosome 4 is particularly interesting because the orange person matches their background mother, of course, but apparently their pink child inherited this entire chromosome from the pink person’s grandfather – because the pink person does not match their grandmother – there are no pink matching segments to the background grandmother.

Chromosome 5

On chromosome 5, the pink child matches the grandmother on almost the entire chromosome, except for a small part to the left of center.

4 gen chr 5

You may notice that there is a segment of blue that appears to extend beyond the pink bar at the left arrow – which would mean that the blue area matches the great-grandmother without matching the pink parent.  The segments on the chromosome map are not exactly to scale, and the beginnings and ends are sometimes what is referred to as fuzzy.  This means that they are not exact measurements; in essence, they record the absence or presence of DNA in a bucket of a specific size.  If any part of your DNA is in that bucket, then your segment start or stop is reported at the edge of that bucket.  In this case, the entire match is 47.51cM for the pink child and 49.82cM for the blue grandchild, so the difference may or may not be relevant.

Although this actually is a small matching segment, or non-matching segment, you would never notice this if you were just looking at the blue grandchild matching to the great grandmother.  It’s only with the introduction of the parent’s pink DNA that you notice that the blue great grandchild’s DNA match with the great grandmother extends beyond that of the parent.

Chromosome 6

4 gen chr 6

Chromosome 6 is rather unremarkable except that the orange person seems to have had a read or file error of some sort.  The orange results are shown in two separate pieces, but we know that the orange person must match their mother 100%.  We know this issue is in the orange person’s file, because their pink child and both of the blue and green grandchildren match the background person, the orange person’s mother, with no break in their DNA.

Chromosome 7

4 gen chr 7

Chromosome 7 shows another example of 4 generations matching, with the stacking of orange, blue, green and pink against the background person’s chromosome, at right.  It also shows another example of an identical by chance match, with the blue grandchild showing a match to their great-grandmother but no match to their pink parent, near the center at the red arrow.

Chromosome 8

4 gen chr 8

Chromosome 8 shows another example of the pink child having inherited a small segment of their grandmother’s DNA, but not passing it on to their children.

How do we know if this is a legitimate IBD segment, or if it is something else?  Since the pink child will match their mother 100%, and they didn’t pass it on to their children, how can we prove that the small pink segment where they match their grandmother is IBD?

How could we prove this one way or the other?

First of all, it probably doesn’t matter, except as a matter of interest – or unless of course this one segment is THE one you need to identify that colonial ancestor.  If this was a normal match, we could just see if the match matched the child and the parent too, which would immediately phase the match against their parent – but we can’t do that when matching to a grandparent because the child will always match their parent 100%.

If you have the grandfather’s DNA at Family Tree DNA, you could compare the pink grandchild to their grandfather. On chromosome 8, the grandfather’s DNA in the pink row is identified by the dark grey – because it’s where the pink grandchild does not match their grandmother – so they must match their grandfather on that segment because their orange parent only had two pieces of DNA to give them, the piece from their mother or the piece from their father.

Therefore, if this is a valid segment, then you won’t see a match in the grandfather’s DNA on the same portion of the segment.  If you see a match to both the grandmother and the grandfather, it’s likely that the small segment match to the grandmother is not identical by descent – but you really don’t know for sure.

How could that be?  I asked David Pike that question and he pointed out that in one case, he discovered that the grandparents both shared the same DNA segment.  The child inherited it from one parent or the other, and passed it on to their child, but since the mother’s and father’s DNA was identical, there is no way to tell which grandparent the segment actually came from.  And in this case, the segment would match both grandparents.  That is a trait of endogamy and of IBS, or identical by population.  If you’re saying, BOO, HISS, about now, I totally understand.

After talking to David, I also realized that if your DNA at those locations just happens to be all homozygous, for example, all Ts, on both sides, for a run of SNPs in a row, and if your parents and grandparents have Ts in either location, you will match them…and anyone else who does too.
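
To see why a homozygous run matches so easily, here is a minimal sketch of half-identical comparison. It assumes the simplest possible rule (two people match at a SNP if they share at least one allele) and ignores the cM and SNP-count thresholds that real matching also applies. The genotypes are invented for illustration.

```python
def half_identical(genotype_a, genotype_b):
    """True if two unphased genotypes share at least one allele."""
    return bool(set(genotype_a) & set(genotype_b))


def run_matches(person_a, person_b):
    """True if every SNP in the run is at least half identical."""
    return all(half_identical(a, b) for a, b in zip(person_a, person_b))


# Hypothetical run of four SNPs; each string is one unphased genotype.
child = ["TT", "TT", "TT", "TT"]          # homozygous T throughout
grandmother = ["CT", "TT", "AT", "GT"]
grandfather = ["TT", "CT", "TT", "AT"]
unrelated = ["GT", "AT", "TT", "CT"]      # a stranger who also carries T
no_t_at_snp_3 = ["CT", "TT", "AA", "GT"]  # lacks T at the third SNP

for name, person in [
    ("grandmother", grandmother),
    ("grandfather", grandfather),
    ("unrelated", unrelated),
    ("no_t_at_snp_3", no_t_at_snp_3),
]:
    print(name, run_matches(child, person))
```

Everyone who carries a T at each position matches the child across the run, related or not, which is exactly why a match to both grandparents on a small segment doesn't tell you which side it came from.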

So here we have an example of a match that could be IBD if it truly is a small segment by descent and you don’t match the other grandparent at that location.  It could be IBC or IBS (by population) if you match both of your grandparents on this segment – but it might be IBD.  It’s IBD from one and IBC/IBS from the other – but which one is which?

However, since I don’t have the grandfather’s DNA at Family Tree DNA, my only other alternative is to move to GedMatch and create a phased kit for the grandfather by subtracting the grandmother’s DNA from her orange child, which will give me the DNA the orange child received from their father.  Then I can compare the pink grandchild to the grandfather’s phased kit – which is the father’s DNA that the orange child received.  This is fine, even if it is only half of the grandfather’s DNA – it’s the half that the pink child’s mother received and passed a portion to the pink child.
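
The idea of "subtracting" one parent can be sketched as follows. This is not GedMatch's implementation, which I haven't seen; it's a simplified illustration with invented function names and genotypes, and it leaves a position unresolved whenever the child's and mother's genotypes don't settle which allele came from the father.

```python
def paternal_allele(child, mother):
    """Infer the allele the child received from the father at one SNP.

    child and mother are two-letter unphased genotypes such as "AG".
    Returns None when the position is ambiguous or inconsistent.
    """
    first, second = child
    if first == second:
        # Homozygous child: both parents gave the same allele.
        return first
    # Heterozygous child: find which of the two alleles the mother carries.
    from_mother = [allele for allele in child if allele in mother]
    if len(from_mother) == 1:
        # Only one allele can be maternal, so the other is paternal.
        return second if from_mother[0] == first else first
    # Mother carries both alleles (ambiguous) or neither (read error).
    return None


def phase_father(child_genotypes, mother_genotypes):
    """Build the father's contribution, one allele (or None) per SNP."""
    return [
        paternal_allele(c, m)
        for c, m in zip(child_genotypes, mother_genotypes)
    ]


# Hypothetical genotypes at five SNPs
orange_child = ["AA", "AG", "CT", "AG", "GG"]
grandmother = ["AG", "AA", "TT", "AG", "GG"]

print(phase_father(orange_child, grandmother))
# ['A', 'G', 'C', None, 'G']
```

The unresolved positions are one reason a phased kit is only an approximation of the half of the grandfather's DNA that the orange child received.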

I would suggest doing this entire exercise on either Family Tree DNA or on the GedMatch platform, and not jumping back and forth between the two.  The start and stop segments aren’t exactly the same, and sometimes the segments read differently, creating more segments at GedMatch than at FTDNA.  I’m not saying that is wrong, just that it isn’t consistent between the two platforms and when you are dealing with small segments, in particular, you need consistency.

Chromosome 9

4 gen chr 9

On chromosome 9, the pink child received little of the grandmother’s DNA, and gave none of it to their green child.  And yes, if you have a good eye, the blue child’s right boundary is slightly beyond that of their pink parent – so – you already know what that means.  Either a fuzzy boundary or a slight piece of DNA that happened to match with the great-grandmother identical by chance (IBC).

Chromosome 10

4 gen chr 10

This chromosome is incredibly interesting because it’s comprised of all small segments.  In fact, this is the exact reason why you NEED to look at the 1cM range.  At the default setting, there are no matches except the orange person to their mother, so it looks like none of the grandmother’s DNA was passed to the pink child, but in fact, that may not be the case.  There are three segments passed to the pink child, although the pink child did not pass these on to either of their children.  See the discussion on chromosome 8 about how to tell for sure, if you need to.

The blue and green segments, since they do not match their pink parent are not IBD but are instead IBC.  The really interesting part of this is that in one case, the blue and green grandchildren’s DNA matches the orange grandmother on the same segments exactly, but does not match the pink parent.

How can this possibly be, you ask, barring a file read issue?  Good question.  Remember, each child inherits half of their parent’s DNA.  In this case, both children apparently inherited the same DNA from both parents, but it wasn’t the orange DNA, but that of the pink child’s father.

It just so happens that when the blue and green children’s DNA is combined with that of their mother, it reads as a match for a small segment.  You can read about how this might happen in the article, “How Phasing Works and Determining IBD Versus IBS Matches.”

Unfortunately, all these comparisons can do is to tell us simply what does and does not match – they can’t tell us why.  Sometimes, based on other comparisons, like phasing and triangulation, we can figure out the “why” part of the puzzle – and sometimes, we can’t.

Chromosome 11

4 gen chr 11

On chromosome 11, the pink child inherited all of the grandmother’s DNA through their orange parent, but gave less than half to their green child and a small segment to the blue child.  The pink child gave the exact same segment in the center to both their blue and green children.

Chromosome 12

4 gen chr 12

On chromosome 12, the pink child inherited little of their grandmother’s DNA, but passed every bit of what they inherited to both of their children, shown by the nice stack at right.  The start and stop locations are exact between the three.

However, in addition, we have three small segments where the green and blue grandchildren match their orange grandmother without matching their pink parent – so those are IBC.

Chromosome 13

4 gen chr 13

The pink child inherited almost all of their grandmother’s entire chromosome, except for a very small bit at the far right end.  The pink child passed almost their entire chromosome 13 to their green child, but only a small amount to the blue child.

Chromosome 14

4 gen chr 14

This story is easy.  The pink child inherited their grandfather’s entire chromosome 14 because they do not match their grandmother’s DNA at all.

Chromosome 15

4 gen chr 15

This is a very “normal” chromosome.  The pink child inherited about half of their grandmother’s DNA and gave about half of what they inherited to their green child.  Of course, their blue child got left out altogether – but that looks to be a lot more “normal” than we once thought.

I am skipping chromosomes 16-22, because they are more of what you’ve already seen and are, by now, quite familiar.  Plus, you can take a look at the full chromosome comparison graphic and do your own analysis.

X Chromosome

The X chromosome is a bit different, and I’d like to take a look at that.

4 gen X

The X chromosome has special inheritance properties that other chromosomes don’t have.  In particular, women inherit an X just like they inherit their other chromosomes from 1-22 – one from Mom and one from Dad.  Men, however, only receive an X from their mother.  Therefore, there are relatives that you cannot inherit any X DNA from.  I wrote about this here and here along with examples and charts.

In this example, the inheritance path is such that it does not affect what can and cannot be inherited since we are comparing to a great-grandmother, but in other situations,  this would not be the case.

One last observation about the X chromosome.  I have found matching on the X to be particularly unreliable, and have found several situations, where, due to those special inheritance properties, we know beyond any doubt that the common ancestor on the X cannot be the same ancestor as has triangulated on the other chromosomes.  So word to the wise – be very vigilant and hesitant to draw conclusions from X matching.  I never utilize the X without corroborating autosomal matches and even then, I’m very reticent.

In Summary

On average, the amount of DNA we inherit from an ancestor is cut in half with each generation.  But the average and the actuality of what happens are two entirely different things.  Averages are made up of all of the outliers, and if you are one of those outliers, the average isn’t really relevant to you.  Kind of reminds me of “one size fits all” which really means “one size fits almost nobody well” and “everyone is some shade of unhappy.”

I wrote about generational inheritance and how it doesn’t always work the way we think, or expect.  It’s very important to pay close attention to your own DNA and not rely on averages unless you have absolutely no other choice – and only then understanding the averages are likely wrong in one direction or the other – but it’s the best we’ve got, under the circumstances.

So what can we apply to our genealogy from this little experiment?

  1. Some of the small segments across 4 generations are valid, meaning identical by descent or IBD.
  2. At least one third of the small segments aren’t valid and are identical by chance, or IBC.
  3. Without some form of triangulation or parental phasing, it’s impossible to tell which small segments are and are not valid, or identical by descent.
  4. Small segments are indeed formed within a 2 or 3 generation span, so they are not always a result of many generations of dividing.
  5. However, the further back in time your ancestor, the more likely that they will only be represented in your DNA by small segments, if any.
  6. Many small segments are valid and are not a result of IBC.  However, most are not and one needs to understand how to recognize signs of an IBC vs an IBD match.
  7. Disregarding small segments uniformly is like throwing away the only clues you may have to your most distant ancestors – which are likely your brick walls.
  8. The largest segment that was not valid was 3.14cM and 600 SNPs.
  9. The smallest valid segment was 1.25cM and 500 SNPs.

Getting the Most Out of Your DNA Experience

There is a lot more information available to us in our DNA results than is first apparent.  It takes a bit of digging and you need to understand how autosomal DNA works in order to ferret out those secrets.  Don’t discount or ignore evidence because it’s more difficult to use – meaning small segments.  The very piece or breadcrumb you need to solve a long-standing mystery may indeed be right there waiting for you.  Learn how to use your DNA information effectively and accurately – including those small segments.

You need to test every cousin you can find and convince to swab or spit.  It’s those cousin matches that help immensely with triangulation and confirming the validity of all DNA segments, matching them back to common ancestors.  You are building walkways or maybe pathways back in time, with your DNA as the steppingstones.  Genetic genealogy is not a one person endeavor.  It takes a village, hopefully of cousins willing to DNA test!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Autosomal DNA Testing 101 – What Now?

When I first started this blog, my goal was to provide explanations and examples of genetic genealogy topics so that there would be fewer questions and easier answers.

That sounded like a great idea, but the reality of the situation is that the consumer market for autosomal DNA testing has exploded – meaning more and more consumers with more and more questions.  Compounding that situation, the consumers who purchase these tests today, especially on impulse, and mostly I’m referring to Ancestry.com here, often have absolutely no idea what to expect or even what they want except that Ancestry will find their ancestors for them.  That’s because that’s what Ancestry tells them in their advertising.

So, in the big picture, the questions and inquiries that experienced people are currently receiving are becoming less specific and more general and often exhibit a lack of understanding of what DNA testing can do.  It’s frustrating to parties on both sides of the fence, but I’m glad people are asking because it means they are interested and willing to learn.

Rather than approach this topic from a technical perspective of how to work with autosomal DNA, I’d like to talk about what can be done with autosomal DNA testing from a newbie perspective.  The person who just got their results back and is saying to themselves, “OK, now what can I do with this?”

However, there is lots of “how to” information in this article for everyone if you click on the links.  If nothing else, this gives you a tool to send to those overly excited newbies who are starry eyed but have no clue how to proceed.  Remember, you were once new too!

This is part 1 of a two part series.  The second part will focus on how to make contact with your matches successfully.  But now, let’s pretend it’s day 1 and you just got your autosomal test results back.

Why Did You Test?

The first question to ask yourself is why did you test in the first place?  If your answer is “because Ancestry had a sale,” that’s fine, but then you’ll need to read all four options to know what you can do with autosomal DNA.

1.  I want to meet other people I’m related to.

Ok, but the first thing here you’re going to have to define is the word “related.”  You are likely related to everyone on your match list.  I said likely, because there may be some people there whose DNA simply matches yours by chance.  For the most part, and especially for those people who are your closest matches, you’re related somehow. The challenge, of course, is to figure out how – meaning through which ancestor.  This is the genealogy jigsaw puzzle of you!

All three of the major vendors, Family Tree DNA, Ancestry and 23andMe show you your closest matches first on your match list.

autosomal 101 FTDNA

Do you want to meet your DNA cousins only if you can identify a common ancestor?  Do you want to work with them on genealogy? The answers to these questions will help sort through the rest of what to do and how.

If your goal is to contact your matches, then Family Tree DNA is the easiest, as they provide you with the e-mail addresses of your matches by clicking on the little envelope for each match on your match page, shown above.

Ancestry is second easiest, but forces you to use their internal message system which often doesn’t deliver the messages.  (Do not send more than 30 in one day or Ancestry will blacklist your messages and block your communications, thinking you are a spammer.)

23andMe is the most difficult as you have to request permission to communicate with each match and also to share DNA and if your match authorizes communication, then you can communicate through 23andMe’s message system.  Sound cumbersome?  It is and the response rate is low.

Confirming Genealogy

Let’s look at another reason for testing.

2.  I want to confirm my genealogy is correct – meaning that my great-grandfather really is my great-grandfather and so forth on up the line.

Well, you’re in luck, especially if some of your cousins, known or otherwise, have tested.  Confirming your genealogy is easier done in closer generations than more distant ones and the more cousins from various lines that have tested, the better.  That’s because you will share more of your DNA with relatives when you have a close common ancestor.

Autosomal DNA is divided approximately in half in each generation, when the child receives half of their DNA from each parent – so the closer your cousin, the more likely you are to share more DNA with them.  The more DNA you share, the more likely you are to be able to identify which ancestor it comes from.  And if a match matches you and your proven cousin both on the same segment, that identifies positively which line that match comes from.  That three way matching is called triangulation.

Let’s talk about the word “confirm.”  Herein lies a challenge, because DNA does have the absolute ability to confirm ancestors, as noted above.  DNA also has the ability to give you hints that go towards a “preponderance of evidence.”  DNA can also lead you astray if you draw erroneous conclusions – and one vendor provides a tool (or tools) that encourages overstepping conclusions.  Let’s look at each circumstance.

Proof Positive through Triangulation

Just what it says – absolutely unquestionable proof that a particular ancestor is your ancestor.  If you match two other people who also descend from your common ancestors, Joe and Jane Doe, on the same segment of DNA, that is confirmation that you share that ancestor and that segment of your DNA is considered proven to that ancestral line.  This requires two things.  First, that your DNA matches on the same segment AND that you have identified the same ancestors, Joe and Jane Doe, genealogically in your trees.

Now, you probably can’t tell which side of the couple, Jane or Joe, the DNA is from unless you also match two people on just Jane’s side of the family or just Joe’s on that same segment.

One caveat here – counting you and your parent as two of the three people doesn’t work because you and your parents are too close in the tree.  By three people, that would preferably be three people who descend from that couple through three different children.

Here’s an example.

JohnDoe

It would also ideally be more than three people, but three is the minimum to form a triangulation group.  In the real world, these matches might not start and end at the same locations as in the example above, but the overlapping portion should be significant.

The example above is proof positive, because the three people descend from the same ancestor, through different children, and match on the same chromosome in the same locations.

This technique is called triangulation.
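
The segment half of that test can be sketched in Python. The names and positions below are hypothetical, and the sketch checks only that the matches overlap on the same chromosome. It says nothing about whether the matches also match each other, whether the shared ancestors have been identified genealogically, or whether the overlap is large enough to matter.

```python
def triangulated_region(matches):
    """Find the region where all of a tester's matches overlap.

    matches is a list of (name, chromosome, start, end) tuples, each
    describing where one person matches the same tester. Returns
    (start, end) of the shared region, or None if there isn't one.
    """
    if len(matches) < 2:
        return None  # need the tester plus at least two other people
    if len({chromosome for _, chromosome, _, _ in matches}) != 1:
        return None  # segments on different chromosomes can't overlap
    start = max(m[2] for m in matches)
    end = min(m[3] for m in matches)
    return (start, end) if start < end else None


# Hypothetical matches to one tester, all descendants of Joe and Jane Doe
matches = [
    ("Cousin A", 5, 40_000_000, 62_000_000),
    ("Cousin B", 5, 45_000_000, 70_000_000),
]

print(triangulated_region(matches))  # (45000000, 62000000)
```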

Now for the bad news – you can’t do this at Ancestry.com, because they don’t provide you with any of the segment information in the last 5 columns.  Ancestry has no chromosome browser, which is the tool that shows you where on your DNA you match your cousins.

Family Tree DNA’s chromosome display tool that is part of their chromosome browser is shown below.

Two cousins browser

On the example above, you can see that Barbara Jean Long, the black background person on the chromosome graphic, is being compared to her two first cousins, the blue and orange on the chromosome graphic.

You can download the information from Family Tree DNA or 23andMe in spreadsheet format, or you can display the information graphically, like in the example above.  You can see the “stacked” locations where both the cousins match the black background person they are being compared to.  You can also see that there are some locations where only one of the cousins matches the background person, like on chromosome 20.  And of course, some locations where neither cousin matches the background person, like on chromosome 21.

If you download that data, the information gives you the locations where the people being compared match the person they are being compared against.

Two cousins combined

The chart above is the download of part of chromosome 1 for Barbara compared with Cheryl and Donald, siblings who are Barbara’s first cousins.

The areas where the 3 people overlap, or triangulate, are colored in green on the spreadsheet, while the rows entirely in pink or blue do not triangulate – meaning Barbara matches either one cousin or the other, but not both.  Keep in mind that this example only proves their common ancestral couple, which in this case are common grandparents – but the technique is the same no matter which common ancestor you are trying to prove.
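
If you'd rather not color the spreadsheet by hand, the same comparison can be sketched in Python. The rows below are invented stand-ins for the downloaded data, not the actual values in the chart, and the tuple layout is just one convenient way to hold chromosome, start and end.

```python
def overlapping_regions(rows_a, rows_b):
    """List the regions where two cousins both match the same person.

    Each row is (chromosome, start, end) for one matching segment.
    """
    found = []
    for chr_a, start_a, end_a in rows_a:
        for chr_b, start_b, end_b in rows_b:
            if chr_a != chr_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:
                found.append((chr_a, start, end))
    return found


# Hypothetical segments where each cousin matches Barbara
cheryl = [(1, 1_000_000, 30_000_000), (1, 80_000_000, 95_000_000)]
donald = [(1, 20_000_000, 50_000_000), (2, 80_000_000, 95_000_000)]

print(overlapping_regions(cheryl, donald))
# [(1, 20000000, 30000000)]
```

The regions returned correspond to the green rows; any segment that never shows up in an overlap belongs to just one cousin, like the rows left entirely pink or blue.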

This brings us to our next topic, that of close relatives.

Close Relative Matches

I previously said that you can’t use you and a close relative to prove a distant ancestor.  But that’s not necessarily true when the relationship you are trying to prove is closer in time.  The chart below shows the relationships of the example above.

Miller Ferverda chart

In the case shown above, two first cousins who are siblings, Cheryl and Don, are being compared to their common first cousin, Barbara.  Their fathers were siblings and their common ancestors were their grandparents.  This is not 6 generations up a tree where matching is iffy.  You can be expected to match closely with your first cousins where you may not match with more distant cousins, because you simply didn’t inherit any of the same DNA from your distant common ancestor.  You should be sharing about 12.5% of your DNA with first cousins, and if you have first cousins that you’re not matching, that might signal that an undocumented adoption has occurred in one line or the other.
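
The 12.5% figure comes from simple halving, which can be sketched as below. This is an average expectation only, assuming the cousins share no other ancestral lines; the function name and parameters are mine.

```python
def expected_shared_fraction(gens_a, gens_b, common_ancestors=2):
    """Average fraction of DNA two relatives share through common ancestors.

    gens_a and gens_b count the generations from each person back to
    the common ancestor(s). Use common_ancestors=2 for a shared couple
    and 1 for half relationships.
    """
    return common_ancestors * 0.5 ** (gens_a + gens_b)


# First cousins: each is two generations from the shared grandparents.
print(f"{expected_shared_fraction(2, 2):.2%}")   # 12.50%
# Second cousins: three generations each to shared great-grandparents.
print(f"{expected_shared_fraction(3, 3):.3%}")   # 3.125%
# Half first cousins share only one grandparent.
print(f"{expected_shared_fraction(2, 2, common_ancestors=1):.2%}")  # 6.25%
```

Real cousins will land above or below these numbers, but a first cousin who shares nothing at all is far outside what chance alone explains.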

In a case like this, if you and a first cousin match, that suffices to prove a close connection.  If you don’t match, it suffices to raise questions.  A lot of questions.  Big ugly questions.  The next thing to do is to see if any other known cousins have tested and who they match – or don’t match.

For example, if Barbara Ferverda was not the child of John Ferverda, she would not match either Cheryl or Don, and we’d know there was a problem.  If Cheryl and Don match other Ferverda or Miller relatives and Barbara didn’t, then we’d know the genetic break in the line was on Barbara’s side and not on Cheryl/Don’s side.

This same technique is also how we know which “side” matches are on.  If an unknown match matches both Barbara and Cheryl, for example, it’s a good bet that their common ancestor is someplace in the Miller/Ferverda line.  If they also match another Miller on the same segment, then the common ancestor has been narrowed to the Miller side of the Miller/Ferverda couple.

Unfortunately, not all DNA results are as definitive or easy to prove as these.  Let’s look at some of the more “squishy” results.

Preponderance of Evidence through Aggregated Data

In regular genealogy, there are a range of proofs.  There is direct evidence that someone is the child of an ancestor.  That would be a will, for example, that names a daughter and her husband and maybe even tells where they moved to.  This would be your lucky day!

Think of that will as equivalent to triangulated proof of a common ancestor.  There is just no arguing with the evidence.

If you’re not that lucky, you have to piece the shreds of indirect evidence together to make a story.  In the genealogy world, this is called preponderance of evidence, and I am always, always much less comfortable with this type of evidence than I am with solid proof.

There are various flavors of pieces of evidence in the DNA world. Sometimes we have hints of relationships without proof.

The most common is when you have matches with a group of people who share the same surname, but you can’t get back far enough to find a common ancestor.  Is this a probable match?  Yes?  Guaranteed?  No.  Have I seen them fall apart and the actual match be on another entirely unrelated line?  Yes.  See why I call these squishy?

Ancestry takes this one step further with their DNA Circles.  For a DNA Circle to be created, you must match DNA with someone in the Circle AND everyone in the Circle must match DNA with someone else in the Circle AND everyone in the Circle must have a common ancestor in their tree.  Circles begin with a minimum of three people.  Generally, the more people who match AND have the same ancestor, the stronger the likelihood that you would be able to confirm the common ancestor of the group as your ancestor too – if you had a chromosome browser type of tool.  Still, Circles alone are not and never will be, proof.  Circles are great hints and along with other research, can confirm genealogical research.  For example, my paper genealogy says I descend from Henry Bolton, and I find myself in Henry Bolton’s tree, matching several other Bolton descendants through Henry’s other children.  Those multiple connections pretty well confirm the paper trail is accurate and no undocumented adoptions have occurred in my line.

Now, the bad news….Circles is predicated upon matching of trees.  If there is a common misconception out there that is replicated in these trees, then people who match will be shown in a Circle predicated on bad information.  And, there is no way to know.  However, people interpret the existence of a DNA Circle as proof positive and that it confirms the tree.  Membership in a DNA Circle is absolutely NOT proof of any kind, let alone proof positive – except that your DNA matches the people who you are connected to by lines and their DNA matches the people they are connected to by lines.  You can see my connections in orange below, and the background connections in light grey.

circle henry bolton matches2

This is an example of my Henry Bolton Circle.  I match 5 different people’s DNA (the orange lines) who also show Henry Bolton as their ancestor.  This does NOT mean the match is on the same segment, so it is NOT triangulated.  This is a grouping of data where multiple people match each other, not a genetic triangulation group where everyone matches on the same segment.  In fact there are cases that I have found where the person I match in a circle is through a different line entirely, so in that case, the presumption of which common ancestor our common DNA is from is incorrect.

I want to be very clear, there is nothing wrong with DNA Circles, so far as they go.  The consumer needs to understand what Circles are really saying – and what they can’t and don’t say.  DNA Circles are another important tool in our arsenal.  We just have to be careful not to assume, or presume, more than is there.  Presuming that we match someone in the Circle because we share Henry Bolton’s DNA may in fact be inaccurate.  We may match on a completely unrelated line – but because we do match and share a common ancestor in our tree – we both find ourselves in the Henry Bolton Circle.

Are you reading those squishy words?  Presume – it’s related to the word assume…right???  And keep in mind that Circles are created based in part on those wonderfully accurate Ancestry trees.  Are you feeling good about this preponderance of evidence yet?

However, in my case, I’ve done due diligence with the genealogy and I have all of my proof ducks in a row.  The fact that I do match so many Bolton descendants confirms my work, along with the fact that at the other vendors and at GedMatch, I have triangulated my matches and proven the Bolton DNA.  So, this Circle is valid, but my proof comes not from Ancestry or from being a Circle member, but from triangulation and aggregated data using other vendors’ tools.

This next screen shot is of an exact triangulated match using GedMatch’s triangulation tool.  Each line shows me matching two cousins, along with the start and stop segments.  This just happens to be the Ferverda example.  So, I match six people, all on the same segment, all with a known common ancestor.  This is proof positive.  Not all “matching” is nearly so definitive.

Gedmatch triangulation
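To make the "same segment" idea concrete, here is a minimal sketch in Python.  It assumes you have already copied each match's chromosome, start and stop locations out of a triangulation report, and that the tool has already confirmed the cousins match each other as well as you.  The Segment class, the function name, and the positions are all made up for illustration; this is not how GedMatch itself works internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str        # name of the match (hypothetical)
    chromosome: str  # chromosome the matching segment is on
    start: int       # start location of the match
    end: int         # stop location of the match


def common_region(segments):
    """Return the (start, end) region shared by every segment, or None.

    This only checks that the segments overlap on one chromosome.  It
    assumes the matches have already been shown to match each other.
    """
    if not segments:
        return None
    chromosome = segments[0].chromosome
    if any(s.chromosome != chromosome for s in segments):
        return None
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return (start, end) if start < end else None


# Made-up positions for three cousins on the same chromosome.
cousins = [
    Segment("Cousin A", "9", 1_000_000, 9_000_000),
    Segment("Cousin B", "9", 2_500_000, 8_000_000),
    Segment("Cousin C", "9", 2_000_000, 9_500_000),
]
shared = common_region(cousins)
```

If common_region returns a region, everyone in the list overlaps on that stretch, which is what a tidy triangulated group looks like.  If it returns None, the group may still be useful as aggregated data, but it is not a single same-segment match.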

Sometimes the matches aren’t so neat and tidy. That’s when we move to using aggregated data.

Aggregated Data – What’s That?

Aggregated data is a term I’ve come up with because there isn’t any term to fit in today’s genetic genealogy vocabulary.  In essence, aggregated data is when a group of people (who may or may not know who their common ancestor is) match on common segments of data, but not necessarily on the same segments, or not all of the same segments.  When you have an entire group of these people, they form a stair step “right shift” kind of graph.

The interesting part of this is that by utilizing aggregated data and looking not only at who we match, but who our matches match that share a common ancestor, we can gain insight and hints.  Finding a common ancestor is of course a huge benefit in this type of situation because then you’ve identified at least a DNA “line” for the entire group.

If we were to utilize the triangulation tools at Gedmatch and look at my closest triangulated matches, they would look something like this, where the segments that I match with each person (or in this case, two people) shift some to the right.  What you are seeing is the start and stop match locations, with graphing.  Therefore, I match all of these people that have a common ancestor.

Each match overlaps the one above and below to some extent – and often by a lot.  These are known as triangulation groups (TG).

However, the top match and the bottom match do not overlap, so they don’t triangulate with each other.  They are still valid triangulated matches to me and you can expect to see this kind of matching when using aggregated data.

Understand that when you see your triangulation groups at GedMatch, your mother’s side and your father’s side will be intermixed. In this case, I know the common ancestor and I know many of these testers, so I’m positive that this is a valid grouping (plus, they all match my Mom too – the best test of all.)

gedmatch triang group

Here’s another example only showing three matches.  All three are triangulated to me through the same ancestor, but the locations of the top and bottom matches don’t overlap with each other.  Both overlap the one in the middle in part.

gedmatch overlap
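The stair step pattern can be sketched in a few lines of Python.  This is only an illustration with made-up start and stop locations, all assumed to be on the same chromosome, and the function names are mine, not anything from GedMatch.  It checks two things: whether each segment overlaps the next one down, and whether the top and bottom segments overlap each other.

```python
def overlaps(a, b):
    """True when two (start, end) segments share any stretch of DNA."""
    return max(a[0], b[0]) < min(a[1], b[1])


def stair_step_report(segments):
    """Return (chained, ends_overlap) for segments on one chromosome.

    chained      - every segment overlaps the next one when sorted by start
    ends_overlap - the first and last segments overlap each other
    """
    ordered = sorted(segments)
    chained = all(
        overlaps(ordered[i], ordered[i + 1]) for i in range(len(ordered) - 1)
    )
    ends_overlap = overlaps(ordered[0], ordered[-1])
    return chained, ends_overlap


# Three made-up matches: top and bottom both overlap the middle one,
# but not each other.
three_matches = [(10, 40), (30, 70), (60, 90)]
report = stair_step_report(three_matches)
```

A result of (True, False) is the "right shift" situation described above: the group chains together, but the top and bottom matches don't triangulate with each other.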

New Ancestor Discoveries – Not Evidence at All

Let’s look at the third reason for DNA testing.

3.  I want to find new ancestors.

Discovering brand new ancestors is a bit tougher.

There are two ways to discover new ancestors.  The first is through triangulation combined with traditional genealogy.  I have done this, but in these cases, I did have a clue as to what I was looking for.  In other words, the new ancestor I discovered was actually confirming a wife’s surname or identifying the parents of an ancestor from several potential candidate couples.

The second way to potentially discover a new ancestor is Ancestry’s New Ancestor Discoveries, NADs, which is really a somewhat misleading name.  What Ancestry has determined is that you match a group of people who share a common ancestor – and Ancestry’s leap of faith is that you share that ancestor too.  While that may not be correct, what IS very relevant is that you do match this group of people who DO share a common lineage and there is an important hint there for you someplace!  But don’t just accept Ancestry’s discovery as your new ancestor – because there is a good chance it isn’t.  Let’s take a look.

Ancestral Lines Through Triangulation

Let’s go back to the John Doe example.

JohnDoe

Let’s take the worst case scenario.  You’re an adoptee and have no information.  But you match an entire group of people in a triangulated group who DO know the identity of their common ancestor.

Does this mean that John Doe is your ancestor?  No.  John Doe could be your ancestor, or he could be the brother of your ancestor, or the uncle of your ancestor.  What this does tell you is that either John Doe is your ancestor, some of John Doe’s ancestors are your ancestors, or you are extremely unlucky and you are matching this entire group by chance.  The larger the segment, the less likely your match will be by chance.  Over 10 cM you’re pretty safe on an individual match and I think you’re safe with triangulated groups well below 10 cM.

Ancestry’s New Ancestor Discoveries

You can make this same type of discovery at Ancestry, but it’s not nearly as easy as Ancestry implies in their ads and you have no segment data to work with, just their match, shown below.

Larimer NAD

“Just take the test and we’ll find your ancestors,” the ad says.  Well, yes and no and “it depends.”

Ancestry went out on a limb a few months ago, right about April Fools’ Day, and frankly, they fell off the end of the branch by claiming that New Ancestor Discoveries are your missing ancestors found.  While that is clearly an overly optimistic marketing statement, the concept of matching you with people you match who all share a common ancestor is sound – it was the implementation and hyper-marketing that was flawed.

The premise here is that if you match people in a Circle that have a common ancestor, that you too might, please note the word might, share that ancestor – even if that person is not in your tree.  In other words, even if you don’t know who they are.  Just like the John Doe triangulation example above.

Here is my connection to the Larimer DNA Circle, even though I don’t know of a Larimer ancestor.

Larimer NAD circle

Now, the problem is that you might be related to an ancestor on one side upstream several generations, but it’s manifesting itself as a match to that particular couple because several of that couple’s descendants have tested.  I’ve shown an example of how this might work below.

common unknown ancestor

In this example, you can see that your true common ancestor is unknown to both groups of people, but it’s not Mary Johnson and John Jones, or in my case, not John and Jane Larimer.

However, three descendants of Mary Johnson and John Jones tested, and you match all three.  If you also showed Mary Johnson and John Jones in your tree, then you’d be in a Circle with them at Ancestry.  However, since Mary Johnson and John Jones are NOT your ancestors, they are not in your tree.  Since you match three of their descendants, Ancestry concludes that indeed, Mary Johnson and John Jones must also be your ancestors.
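The difference between a Circle and a NAD, as described in this article, boils down to whether the ancestor is in your tree.  Here is a rough Python sketch of that reasoning.  To be clear, this is NOT Ancestry's actual algorithm, which isn't public.  The function name, the labels, and the min_matches value of three (borrowed from the example above) are my own assumptions for illustration.

```python
def classify_group(my_tree, my_dna_matches, ancestor, descendants, min_matches=3):
    """Label my relationship to a group of testers who share an ancestor.

    my_tree        - set of ancestor names in my own tree
    my_dna_matches - set of tester names whose DNA matches mine
    ancestor       - the ancestor (or couple) the group has in common
    descendants    - set of testers who show that ancestor in their trees
    min_matches    - hypothetical threshold, taken from the example above
    """
    shared = my_dna_matches & descendants
    if not shared:
        return "no connection"
    if ancestor in my_tree:
        # DNA match plus the same ancestor in both trees.
        return "circle"
    if len(shared) >= min_matches:
        # DNA matches, but the ancestor is not in my tree.
        return "new ancestor discovery (hint only)"
    return "dna matches only"


# Made-up testers who all descend from the example couple.
jones_descendants = {"Tester 1", "Tester 2", "Tester 3"}
label = classify_group(
    my_tree={"Henry Bolton"},
    my_dna_matches={"Tester 1", "Tester 2", "Tester 3", "Someone Else"},
    ancestor="Mary Johnson and John Jones",
    descendants=jones_descendants,
)
```

Notice that nothing in this logic ever looks at segments.  That is exactly why the result is a hint and not proof: the shared DNA could come from an unknown ancestor further upstream.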

While NADs are inaccurate about half the time, the fact that you do share DNA with the people in this group is important, because someplace, upstream, it’s likely that you share a common ancestor.  It’s also possible that you match these three people through unconnected ancestors upstream and it’s a fluke that they all three also descend from this couple.  And yes, that does happen, especially when all of the people involved have ancestors from the same region.

The first day that Ancestry rolled out the New Ancestor Discoveries, I was assigned a couple that could not possibly be my ancestors.  I called them Bad NADs.

In my experience, there are more erroneous NADs out there than good ones.  I knew my original one was bad, as I had proof positive because I have triangulated my other lines.  Then, one day, my bad NAD was gone and now, a few weeks later, I have another assigned NAD couple that I have not been able to prove or disprove – the Larimers.  Truthfully, after the bad NAD fiasco, I haven’t spent a lot of time or effort because without tools, there is no place to go with this unless the people I match will download their results to GedMatch.  I’m hoping that a new tool to be released soon will help.

Here’s how NADs could be useful.  Let’s say that my Larimer matches download to GedMatch and I discover that they also match a triangulated group from my McDowell line.  Well, guess what – my Michael McDowell’s wife is unknown.  Might she be a Larimer?  Michael’s mother is also unknown.  Might she be a Larimer?  It gives me a line and a place to begin to work, especially if they share any common geography with my ancestors.

Even if the NADs aren’t my direct ancestors, this is still useful information, because somehow, I probably do connect to these people, even though my hands are somewhat tied.  However, labeling them New Ancestor Discoveries encourages people to jump to highly incorrect conclusions.  This isn’t even in the preponderance of evidence category, let alone proof.  It’s information that you can potentially use with other DNA tools (at GedMatch) and old fashioned genealogy to work on proving a connection to this line.  Nothing more.

So what is the net-net of this? Circles can count in the preponderance of evidence, especially in conjunction with other evidence, but NADs don’t.  Neither is proof.  If we were able to work with the segment data and compare it, we might very well be able to determine more, but Ancestry does not provide a chromosome browser, so we can’t.

Ancestor Chromosome Mapping

4.  I want to map my chromosomes to my ancestors so that I know which of my DNA I inherited from each ancestor.

If this is your DNA testing goal, you certainly did not start by testing with Ancestry.com, because they don’t have any tools to help you do this.  This tends to be a goal that people develop after they really understand what autosomal DNA testing can do for them.  In order to map your genome, you have to have access to segment information and you have to triangulate, or prove, the segments to each ancestor.  So count Ancestry out unless you can talk your matches into downloading their raw data files to either GedMatch or Family Tree DNA.  You’ll be testing with both Family Tree DNA and 23andMe and downloading your match information to a spreadsheet and utilizing the tools at www.gedmatch.com and www.dnagedcom.com.

Just so you get an idea of how much fun this can be, here’s my genome mapped to ancestors a few months ago.  I have more mapped now, but haven’t redone my map utilizing Kitty Cooper’s Tools.

Roberta's ancestor map2

Tips and Tricks for Contact Success

Regardless of which of these goals you had when you tested, or have since developed, now that you know what you can do – most of the options are going to require you to do something – often contacting your matches.

One thing that doesn’t happen is that your new genealogy is delivered to you gift wrapped, so that all you have to do is open the box, untie the bow around the scroll, and roll it down the hallway.  That only happens on the genealogy TV shows:)

So join me in a few days for part two of Autosomal DNA Testing 101 – Tips and Tricks for Contact Success.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Parent-Child Non-Matching Autosomal DNA Segments

Recently, I had the opportunity to compare 2 children’s autosomal DNA against both of their parents.  Since children obtain 50% of their DNA from each parent (except for the X chromosome in males), it stands to reason that all valid autosomal matches to these children not only will, but must match one parent or the other.  If not, then the match is not valid – in other words – it’s a match that is identical by chance.

If you remember, the definition of a match by chance, or IBC (identical by chance) is when someone matches a child but doesn’t match either parent.

This means that the DNA segments, or alleles, just happen to line up so that it reads as a match for the child, by zigzagging back and forth between the DNA of both parents, but it really isn’t a valid genealogical match.

You can read about how this works in my article, How Phasing Works and Determining IBD Versus IBS Matches and also in the article, One Chromosome, Two Sides, No Zipper.

The absolute best way to determine if a match is a valid match or not, valid meaning that the DNA was handed down by ancestors, not a match by chance, is to compare a child’s matches against both parents.  By doing that, we can quickly identify and isolate matches that aren’t real.

IBC

In the example above, you can see that Mom contributed all As to me and Dad contributed all Cs to me.  Joe has alternating As and Cs, so he is a match to me on every location.  However, he only matches my parents on half of their locations, so he is not a match to them, because it’s only chance that caused him to match me on those allele values in that order.

DNA matching programs have to take into consideration both allele values in their match routines, since you carry a value from your mother (A above) and a value from your father (C above), and they are not labeled as to which parent they come from.
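Here is a small Python sketch of that example.  It is a deliberately simplified model, not any vendor's matching routine.  I'm assuming Mom is AA at every location, Dad is CC at every location, and Joe alternates between AA and CC, which is one way to get the alternating pattern in the illustration.  Two people "match" at a location if they share at least one allele value there, since the software can't tell which value came from which parent.

```python
def half_identical(genotype_1, genotype_2):
    """True when two unphased genotypes share at least one allele value."""
    return bool(set(genotype_1) & set(genotype_2))


def matching_fraction(person_1, person_2):
    """Fraction of locations where the two people share an allele value."""
    hits = sum(half_identical(a, b) for a, b in zip(person_1, person_2))
    return hits / len(person_1)


locations = 10
mom = ["AA"] * locations   # assumed: Mom only has As to give
dad = ["CC"] * locations   # assumed: Dad only has Cs to give
me = ["AC"] * locations    # one A from Mom, one C from Dad at each location
joe = ["AA" if i % 2 == 0 else "CC" for i in range(locations)]  # alternating

joe_vs_me = matching_fraction(me, joe)
joe_vs_mom = matching_fraction(mom, joe)
joe_vs_dad = matching_fraction(dad, joe)
```

In this toy data Joe matches me at every location, but only at half of the locations for each parent.  The "match" to me exists only because the comparison zigzags between my mother's value and my father's value.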

Valid matches will also match one parent or the other.  After all, the child received all of their DNA from one parent or the other, so for someone to be a valid genealogical match to a child, they must match a parent.

Some time back, when I was matching to my own mother’s DNA, I noticed that I matched her on about 40% of my matches, which left 60% to either be matches to my father or identical by chance.

Notice, I’m not talking about IBS, or identical by state, because that phrase is used to mean both identical by chance and identical by population.  Identical by population means that you did in fact inherit the DNA from an ancestor, but it’s either too far back in time to determine which ancestor, or that segment was present in a specific, probably endogamous population, and you could have inherited it from any number of ancestors.

So, identical by population is identical by descent, but we just can’t tell who we received that DNA from.

  • IBC – identical by chance – not a valid match – you happen to match someone else on a particular segment, but it’s because the match software is jumping back and forth from your mother’s side to your father’s side.
  • IBD – Identical by descent – you share a common segment of DNA because you and another person(s) inherited that DNA segment from a common ancestor who you can identify
  • IBS – Identical by state – currently used to mean both identical by chance and identical by population, where identical by population means that you did inherit this DNA from a common ancestor, but it’s so far back you can’t determine who, or that segment is so common within a particular population you could have inherited it from a number of people.

Now a 60-40 parental split is certainly possible, especially if one parent was from an endogamous population, which would mean more matches, or one parent was more recently immigrated from the old country, which would mean fewer matches.

However, without my father’s DNA, which is not available, we’ll never know.

Since that time, I have obtained access to 2 sets of child plus both parents DNA results, so I wanted to take a look at how IBD versus IBC stacked up.  These comparisons were done at Family Tree DNA.

            Total Matches   Non-Matching Either Parent   Percent Non-Matching
Child 1     959             133                          13.9
Child 2     1037            133                          12.8

Based on other evidence I’ve seen, this percentage seems about right, but the amount of shared DNA and the largest segment size surprised me.  Keep in mind that the smallest possible segment size is 7cM which is Family Tree DNA’s lowest single segment threshold to be counted as a match (assuming you meet the 20cM total threshold first.)  If you match, they show you your matching DNA down to 1cM, but these tables are measurements by the 7cM matching criteria only.
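The two thresholds just mentioned can be written as a simple check.  This is only a sketch of the rule as I've described it in this article, with both numbers taken from the paragraph above.  Whether the real cutoffs are inclusive, and any other criteria the vendor may apply, are assumptions on my part, and the function name is made up.

```python
MIN_TOTAL_CM = 20.0    # total shared cM threshold quoted above
MIN_LONGEST_CM = 7.0   # lowest single-segment threshold quoted above


def counts_as_match(total_cm, longest_block_cm):
    """True when both thresholds described in the article are met."""
    return total_cm >= MIN_TOTAL_CM and longest_block_cm >= MIN_LONGEST_CM


# One made-up person who clears both thresholds and two who do not.
clears_both = counts_as_match(46.87, 14.38)
total_too_small = counts_as_match(18.0, 9.0)
block_too_small = counts_as_match(25.0, 6.5)
```

The point is that a person has to clear both hurdles, so a long list of tiny segments adding up to more than 20cM still won't show as a match without at least one 7cM block.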

In plain English, this means that in this case, 13.9% and 12.8% of these matches were identical by chance, or false matches.  These matches included people who shared up to 57cM of data and the largest block was 15cM.

            Largest Shared cM   Largest Longest Block
Child 1     46.87               14.38
Child 2     57.06               15.18

Could something else be causing this?  Certainly.  Some of these non-matches could be read errors in the files.  I’d certainly want to take a look at that if any of these became critical.  Another possibility could be that valid match segments are “stitched together” by IBC segments creating longer segments in the child.

An alternative to check validity would be to download the files to GedMatch and see if the pattern continues using the same match criteria.  Of course, testing at multiple labs and downloading the results to compare at GedMatch likely removes the issue of read errors in the first set of files.  And if you really, REALLY, want to know, you can look at the raw data files themselves.

Just so you know, this wasn’t an anomaly with just one high read.  Here are the highest 25 entries from Child 2, or about one fifth of her total mismatches.  Only a few were in the 3-5th cousin range.  None were closer.  Most were 4th or 5th to remote.

non-parent matching relationship range

If you want to do these comparisons yourself, they are easy to do if you have a child and both parents who have tested at Family Tree DNA.

On your Family Finder matches page, at the bottom, in the right corner, there is a button to download matches.

download button

I download the matches into separate spreadsheets for the child, mother and father.  I then color all of the rows pink in the mother’s results, and blue in the father’s results, then copy all three to a common spreadsheet.  You can then sort on the match name and this is what you’ll see.

non-match example

What you’re looking for is white (child) rows that don’t match either a blue row (father) or a pink row (mother.)  Don’t worry about pink or blue rows that don’t have matches. It’s normal for the DNA not to be passed to the child part of the time, so these are expected.

In this example, all white rows matched one parent or the other, except for Winnie Whines.  I colored this row red and added the Comment column where I entered the number of this non-matching entry.  When I’m finished comparing and coloring, then all I have to do is sort that column, bringing all of the nonmatching rows together.  I copied those nonmatching entries into a separate sheet so I could sort those alone and obtained the largest shared and longest segments.  To determine the percent, just divide the total number of nonmatches, in this case, 133, by the child’s total number of matches, in this case, 959, giving a non-parent-match percentage of 13.9%.
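If you would rather not color rows by hand, the same comparison can be sketched in Python.  This assumes you have already loaded the three downloaded match lists into lists of dictionaries.  The keys "name", "shared_cm" and "longest_block" are hypothetical names I've chosen for the sketch, not the actual column headings in the download, and the people and numbers are made up.  Like the spreadsheet sort, it compares on the match name, so two different people with the same name would be lumped together.

```python
def non_parent_matches(child, mother, father):
    """Return the child's match rows whose names appear for neither parent."""
    parent_names = {row["name"] for row in mother} | {row["name"] for row in father}
    return [row for row in child if row["name"] not in parent_names]


def summarize(child, mother, father):
    """Count non-parent matches and find the largest shared and longest values."""
    missing = non_parent_matches(child, mother, father)
    return {
        "total": len(child),
        "non_matching": len(missing),
        "percent": round(100 * len(missing) / len(child), 1),
        "largest_shared_cm": max((r["shared_cm"] for r in missing), default=0.0),
        "longest_block_cm": max((r["longest_block"] for r in missing), default=0.0),
    }


# Tiny made-up example: only Winnie Whines matches the child but neither parent.
child = [
    {"name": "Ann Able", "shared_cm": 52.1, "longest_block": 21.0},
    {"name": "Bob Baker", "shared_cm": 33.4, "longest_block": 12.5},
    {"name": "Cy Carter", "shared_cm": 27.9, "longest_block": 9.8},
    {"name": "Winnie Whines", "shared_cm": 24.6, "longest_block": 8.1},
]
mother = [
    {"name": "Ann Able", "shared_cm": 101.3, "longest_block": 40.2},
    {"name": "Cy Carter", "shared_cm": 55.0, "longest_block": 18.7},
]
father = [{"name": "Bob Baker", "shared_cm": 70.2, "longest_block": 25.1}]

summary = summarize(child, mother, father)
```

The percent is the same division described above.  With the real numbers for Child 1, 133 divided by 959 works out to 13.9%.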

So, the take-home message is that not all small segment matches are genealogically irrelevant and not all larger segment matches are genealogically relevant.  Thank goodness we have tools and processes to begin to tell the difference.

So, if you don’t have both parents to compare to, and you’re wondering why you just can’t find a common ancestor with someone you match, the answer might be that they fall into the roughly 13 or 14% that are IBC matches.

If you perform this little exercise, comparing a child to both parents, please feel free to post your results in the comments section along with any commentary about endogamous populations or special circumstances.  It really doesn’t take long, probably about an hour total, and the results are really interesting.  Plus, you’ll have eliminated all those irrelevant matches.

I’ll be writing more about this interesting experiment in coming days.
