Comparing DNA Results – Different Tests at the Same Testing Company

Several people have asked about different tests at the same DNA testing company. They wondered if matching is affected, meaning whether your matches are different if you have two different tests at the same company. Specifically, they asked if you are better off purchasing a test AT a DNA testing vendor that allows uploads, rather than uploading a test from a different vendor. Does it make a difference to the tester or their matches? Do they have the same matches?

These are great questions, and the answer isn’t conclusive. It varies based on several factors.

Having multiple tests at the same DNA testing company can occur in three ways:

  • The same person tests twice at the same DNA testing company.
  • The same person tests once at the DNA testing company and uploads a test from a different testing company. Only two of the primary four DNA testing companies accept uploads from other vendors – FamilyTreeDNA and MyHeritage.
  • The same person uploads two different files from other DNA testing companies to the DNA testing company in question. For example, the DNA company could be FamilyTreeDNA and the two uploaded DNA files could be from either MyHeritage, 23andMe or Ancestry.

All DNA testing companies allow users to download their raw DNA data files. This enables the tester to upload their DNA file to the vendors who accept uploaded files. Both FamilyTreeDNA and MyHeritage provide matching for free, but advanced tools require a small unlock fee of $19 and $29, respectively.

Testing Company | Accepts Uploads from Other Companies | Download/Upload Instructions
23andMe | No | Instructions here
Ancestry | No | Instructions here
FamilyTreeDNA | Yes, some | Instructions here
MyHeritage | Yes, some | Instructions here

I wrote about developing a DNA testing and transfer/upload strategy, here, and about which companies accept which tests, here.

Not all DNA files are created equal. Therefore, not all files from vendors are compatible with other vendors for various reasons.

Multiple Tests at the Same DNA Testing Company

I have at least two tests at each of the four major vendors. I did this for research purposes, meaning to write articles to share with you.

If you actually test twice at a vendor, meaning purchase two separate tests and take them yourself, you will have two test results at that testing company. At some companies, specifically 23andMe, if you purchase a new test through their “upgrade” procedure, you won’t have two tests, just the newer one.

However, if you’re testing at the DNA testing company, and also uploading, I generally don’t recommend more than one test at each vendor. All it really does is clog up people’s match lists with no or little additional benefit. At 23andMe, with their restrictions on the size of your match list, if everyone had two tests, the effective match limit would be half of their stated limit of about 1500 matches for earlier testers and about 5000 for current testers with subscriptions.

So, in essence, I’m telling you to “do as I say, not as I do.” We all have better things to do with our money than pay for the same test twice. If you haven’t tested your Y-DNA or mitochondrial DNA, that’s much more beneficial than two autosomal tests at one vendor.

Chips and Chip Evolution

Before we begin the side-by-side comparison, let’s briefly discuss DNA testing chips and how they work.

Each DNA testing company purchases DNA processing equipment. Illumina is the big dog in this arena. Illumina defines the capacity and structure of each chip. In part, how the testing companies use that capacity, or space on each chip, is up to each company. This means that the different testing companies test many of the same autosomal DNA SNP locations, but not all of the same locations.

Furthermore, the individual testing companies can specify a number of “other” locations to be included on their chip, up to the chip maximum size limit. The testing companies who offer Y-DNA or mitochondrial DNA haplogroups from autosomal tests use part of their chip array space for selected known haplogroup-defining SNP locations. This does NOT mean that Y-DNA or mitochondrial DNA is autosomal, just that the testing company used part of their chip array space to target these SNPs in your genome. Of course, for your most refined haplogroup and Y-DNA or mitochondrial DNA matching, you have to take those specific tests at FamilyTreeDNA.

This means that each testing company includes and reports many of the same, but also some different SNP locations when they scan your DNA.

In the lab, after your DNA is extracted from either your saliva or the cheek swab, it’s placed on this array chip which is then placed in the processing equipment.

There are several steps in processing your DNA. Each DNA location specified on the chip is scanned and read multiple times, and the results are recorded. The final output is the raw DNA results file that you see if/when you download your raw DNA file.

Here’s an example from my file. The RSID is the reference SNP cluster ID which is the naming convention used for specific SNPs. It’s not relevant to you, but it is to the lab, along with the chromosome number and position, which is in essence the address on the chromosome.

In the Result column, your file reports one nucleotide (T, A, C or G) that you inherited from each parent at each tested position. They are not listed in “parent order” because your DNA is not organized in that fashion. There’s no way for the lab to know which nucleotide came from which parent, unless they are the same, of course. You can read about nucleotides, here.
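If you’d like to look inside a raw file yourself, here is a minimal Python sketch of reading one. It assumes a tab-separated layout with four columns (RSID, chromosome, position, result) and comment lines that begin with “#”. Real vendor files vary, and some split the result into two columns, so treat this as an illustration only. The sample rows and the function name parse_raw_dna are made up for the example.

```python
import csv
import io

# Hypothetical sample in an assumed four-column, tab-separated layout.
SAMPLE = (
    "# example header comment\n"
    "rsid\tchromosome\tposition\tresult\n"
    "rs111\t1\t1000\tAA\n"
    "rs222\t1\t2000\tCT\n"
    "rs333\t2\t1500\t--\n"
)

def parse_raw_dna(text):
    """Return {rsid: (chromosome, position, result)} for each data row."""
    snps = {}
    # Drop comment lines before handing the rest to the csv reader.
    lines = (ln for ln in io.StringIO(text) if not ln.startswith("#"))
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 4 or row[0].lower() == "rsid":
            continue  # skip the column-header row and malformed lines
        rsid, chromosome, position, result = row
        snps[rsid] = (chromosome, int(position), result)
    return snps

snps = parse_raw_dna(SAMPLE)
print(snps["rs222"])
```

The result is kept as a two-letter string because, as noted above, the file does not record which nucleotide came from which parent.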

When you upload your raw DNA file to a different DNA testing company (vendor), they have to work with a file that isn’t entirely compatible with the files they generate, or the other files uploaded from other DNA testing companies.

In addition to dealing with different file formats and contents from multiple DNA vendors, companies change their own chips and file structure from time to time. In some cases, it’s a forced change by the chip manufacturer. Other times, the vendors want to include different locations or make improvements. For example, with 23andMe’s focus on health, they probably add new medically related SNP locations regularly. Regardless of why, some DNA files include locations not included in other files and are not 100% compatible.

Looking at the first few entries in my example file above, let’s say that the testing vendor included the first ten positions, but an uploaded file from another company did not. Or perhaps the chip changed, and a different version of the company’s own file contains different positions.
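To picture that mismatch, here is a small Python sketch that compares the SNP identifiers in two files and counts what they share. The identifiers are invented, and overlap_report is a name I made up. This is only a way to think about the problem, not how any vendor actually does it.

```python
def overlap_report(file_a_ids, file_b_ids):
    """Count SNP identifiers shared by two files and unique to each."""
    a, b = set(file_a_ids), set(file_b_ids)
    shared = a & b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "share_of_a": len(shared) / len(a) if a else 0.0,
    }

# Hypothetical example: the vendor's own file covers positions 1-20,
# while the uploaded file covers positions 11-25.
vendor_ids = [f"rs{i}" for i in range(1, 21)]
upload_ids = [f"rs{i}" for i in range(11, 26)]

report = overlap_report(vendor_ids, upload_ids)
print(report)
```

In this made-up case, the first ten positions exist only in the vendor’s file, which is exactly the kind of gap the receiving company has to deal with.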

DNA testing companies have to “fill in the blanks” for compatibility, and they do this using a technique called imputation. Illumina forced their customers to adopt imputation in 2017 when they dropped the capacity of their chip. I was initially quite skeptical, but imputation has worked surprisingly well. Some of the matching differences you will see when comparing the results of two different DNA files is a result of imputation.

I wrote about imputation in an early article here. Please note the companies have fixed many issues with imputation and improved matching greatly, but the concepts and imputation processes still apply. The downloaded raw data files are your results BEFORE imputation, meaning that it’s up to any company where you upload to process your raw file in the same way they would process a file that they generated. A lot goes on behind the scenes when you upload a file to a DNA testing company.

At both 23andMe and Ancestry, you know that all of your matches tested there, meaning they did not upload a file from another testing company. You don’t know and can’t tell what chip was utilized when your matches tested. The only way to determine a chip testing version, aside from knowing the date or remembering the chip version from when you tested, is to look at the beginning of the raw data download file, although not all files contain that information.
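If you’re comfortable with a little Python, a sketch like this one can pull out the comment lines at the top of a raw data file so you can scan them for a chip or array version. It assumes header lines begin with “#”, and the sample header text is hypothetical. As mentioned, not every file includes this information, so an empty result doesn’t mean anything is wrong.

```python
def header_lines(text, limit=30):
    """Collect the leading comment lines (assumed to start with '#')."""
    found = []
    for line in text.splitlines()[:limit]:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        found.append(line.lstrip("#").strip())
    return found

def version_hints(text):
    """Return header lines that mention a version or an array."""
    keywords = ("version", "array")
    return [ln for ln in header_lines(text) if any(k in ln.lower() for k in keywords)]

# Hypothetical file contents, for illustration only.
sample = (
    "# Hypothetical vendor raw data download\n"
    "# array version: V2\n"
    "rsid\tchromosome\tposition\tresult\n"
    "rs111\t1\t1000\tAA\n"
)

print(version_hints(sample))
```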

Ok, now that you understand the landscape, let’s look at my results at each company.

23andMe

I tested twice at 23andMe on two different chip versions, V3 and V4, which tested some different locations of my DNA. Neither of these chips is the current version. I originally tested twice to evaluate the differences between the two test versions which you can read about, here.

23andMe named their ethnicity results Ancestry Composition.

They last updated my V3 test’s Ancestry Composition results on July 28, 2021.

The percentages are shown at left, and the country locations are highlighted at right for my 23andMe V3 test.


The 23andMe V4 test was also updated for the last time on July 28, 2021.

The ethnicity results differ substantially between the two chip versions, even though they were both updated on the same date.

In October of 2020, in an effort to “encourage” their customers to pay for a new test on their V5 chip, 23andMe announced that there would be no ethnicity updates on older tests. So, I really don’t know for sure when my tests were actually updated. Just note how different the results are. It’s also worth mentioning that 23andMe does not show trace amounts on their map, so even though my Indigenous American results were found, they aren’t displayed on the map.

Indigenous is, however, shown in yellow on their DNA Chromosome Painting.

No other testing company restricts updates, penalizing their customers who purchased earlier versions of tests.

Matches at 23andMe

23andMe limits your matches to about 1500 unless you have purchased the current test, including health AND pay for an annual $69 subscription which buys you about 5000 matches. I have not purchased this test.

Your number of actual matches displayed/retained is also affected by how many people you have communicated with, or at least initiated communications with. 23andMe does not roll those people off of your match list.

I have 1803 matches on both of my tests, meaning I’ve reached out to about 300 people who would have otherwise been removed from my match list. 23andMe retains your highest matches, deleting lower matches after you reach the maximum match threshold.

I’ve randomly evaluated several of the same matches at each vendor, at least five maternal and five paternal, separated by a blank row. I wanted to determine whether they match me on the same number of centimorgans, meaning the same amount of DNA, on both tests, and the same number of segments.

Match | 23andMe V3 | 23andMe V4
Patricia | 292 cM, 12 seg | Same
Joe | 148 cM, 8 seg | Same
Emily | 73 cM, 4 seg | 72 cM, 4 seg
Roland | 27 cM, 1 seg | Same
Ian | 62 cM, 4 seg | Same
Stacy | 469 cM, 16 seg | 482 cM, 16 seg
Harold | 134 cM, 6 seg | Same
Dean | 69 cM, 3 seg | Same
Carl | 95 cM, 4 seg | Same
Debbie | 83 cM, 4 seg | 84 cM, 4 seg

As you can see, the matches are either exact or close.

Please note that bolded matches are also found at another company. I will include a summary table at the end comparing the same match across multiple vendors.

23andMe Summary

The 23andMe V3 and V4 match results are very close. Since the match limit is the same, and the results are so close between tests, they are essentially identical in terms of matching.

The ethnicity results are similar, but the V4 test reflects a broader region. Italian baffles me in both versions.

Ethnicity should never be taken at face value at any DNA testing company, especially with smaller percentages which could be noise or a combination of other regions which just happens to resemble Italy, in my case.

I don’t know what type of comparison the current chip would yield since I suspect it has more medical and fewer genealogical SNPs on board.

Reprocessing Tests

This is probably a good place to note that it’s very expensive for any company to update their customers’ ethnicity results because every single customer’s DNA results file must be completely rerun. Note that this does not mean their DNA itself is retested. The output raw data file is reprocessed using a new algorithm.

Rerunning means reprocessing that specific portion of every test, meaning the vendors must rent “time in the cloud.” We are talking millions of dollars for each run. I don’t know how much it costs per test, but think about the expense if it takes $1 to rerun each test in the vendor’s database. Ancestry has more than 20 million tests.

While we, as consumers, are always chomping at the bit for new and better ethnicity results – the testing companies need to be sure it really is “better,” not just different before they invest the money to reprocess and update results.

This is probably why 23andMe decided to cease updating older kits. The newer tests require a subscription which is recurring revenue.

The same is true when DNA testing companies need to rematch their entire user base. This happens when the criteria for matching change. For example, Ancestry purged a large number of matches for all of their customers back in 2020. While match algorithm changes necessitate rematching, with associated costs, this change also provided Ancestry with the huge benefit of eliminating approximately half of their customers’ matches. This freed up storage space, either physically in their data center or space rented in the cloud, representing substantial cost savings.

How long can a DNA testing company reasonably be expected to continue investing in a product which never generates additional revenue but for which the maintenance and reinvestment costs never end?

Ancestry and MyHeritage both hope to offset the expenses of maintaining their customers’ DNA tests and providing free updates by selling subscriptions to their record services. 23andMe wants you to purchase a new test and a yearly subscription. FamilyTreeDNA wants you to purchase a Big Y-DNA and mitochondrial DNA test.

OK, now let’s look at my matches at Ancestry.

Ancestry

I’ve taken two Ancestry tests, V1 and V2. There were some differences, which I wrote about here and here. V2 is no longer the current chip.

Except for 23andMe, which wants its customers to purchase the most current test, the other companies no longer routinely announce new chip versions. They just go about their business. The only way you know that a vendor actually changed something is when the other companies who accept uploads suddenly encounter an issue with file formats. It always takes a few weeks to sort that out.

My Ancestry V1 test’s ethnicity results don’t show my Native American ethnicity.

Ancestry results were updated in June 2022

However, my V2 results do include Native American ethnicity.

Matches at Ancestry

I have many more matches on my V1 test at Ancestry because I took steps to preserve my smaller matches when Ancestry initiated its massive purge in 2020. I wrote about that here and here.

Ancestry’s SideView breaks matches down into maternal, paternal, and unassigned based on your side selection. You tell Ancestry which side is which. You may be able to determine which “side” is maternal or paternal either by your ethnicity or shared matches. While SideView is not always accurate, it’s a good place to begin.

Match Category | Ancestry V1 Test | Ancestry V2 Test
Maternal | 15,587 | 15,116
Paternal | 42,247 | 41,870
Both | 2 | 2
Unassigned | 48,999 | 4,127
Total | 106,835 | 61,115

Ancestry either displays all your matches or your matches by side, which I used to compile the table above. I suspect that Ancestry is not assigning any of the smaller preserved matches to “sides” based on the numbers above.
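One way to sanity-check that suspicion is to compare the gap in unassigned matches with the gap in total matches between the two tests. The Python sketch below uses only the numbers from the table above. The variable names are mine, and the interpretation is still my guess, not something Ancestry has confirmed.

```python
# Match counts copied from the SideView table above.
v1 = {"maternal": 15_587, "paternal": 42_247, "both": 2, "unassigned": 48_999}
v2 = {"maternal": 15_116, "paternal": 41_870, "both": 2, "unassigned": 4_127}

total_v1 = sum(v1.values())
total_v2 = sum(v2.values())

# How many more matches does V1 have overall, and how many of those are unassigned?
total_gap = total_v1 - total_v2
unassigned_gap = v1["unassigned"] - v2["unassigned"]

print(f"V1 total: {total_v1:,}  V2 total: {total_v2:,}")
print(f"Extra V1 matches: {total_gap:,}")
print(f"Extra V1 unassigned matches: {unassigned_gap:,} ({unassigned_gap / total_gap:.0%} of the gap)")
```

Nearly all of the additional matches on the V1 test sit in the unassigned category, which is consistent with the smaller preserved matches not being assigned to a side.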

Ancestry implemented a process called Timber that removes DNA that they feel is “too matchy,” meaning you match enough people in this region that they think it’s a pileup region for you personally, and therefore not useful. In some cases, enough DNA is removed causing that person to no longer be considered a match because they fall beneath the match threshold. I am not a fan of Timber.

Your match amount shown is AFTER Timber has removed those segments. Unweighted shared DNA is your pre-Timber match amount.

You can view the Unweighted shared DNA by clicking on the amount of shared DNA on your match list.

You can read Ancestry’s Matching White Paper, here.

Let’s take a look at my matches. I’ve listed both weighted and unweighted where they are different.

Match | Ancestry V1 | Ancestry V2
Michael | 755 cM, 35 seg | 737 cM, 33 seg
Edward | 66 cM, 4 seg (unweighted 86 cM) | 65 cM, 4 seg (unweighted 86 cM)
Tom | 59 cM, 3 seg (unweighted 63 cM) | Same
Jonathon | 43 cM, 4 seg (unweighted 52 cM) | Same
Matthew | 20 cM, 2 seg (unweighted 35 cM) | Same
Harold | 132 cM, 7 seg | 135 cM, 6 seg
Dean | 67 cM, 4 seg (unweighted 78 cM) | 66 cM, 4 seg (unweighted 78 cM)
Debbie | 93 cM, 5 seg | Same
Valli | 142 cM, 3 seg | Same
Jared | 20 cM, 1 seg (unweighted 22 cM) | Same

Timber only removes DNA when the match is under 90 cM. Almost every match under 90 cM has some DNA removed.

Ancestry Summary

The results of the two Ancestry tests are very close.

In some circumstances, no DNA is removed by Timber, so the unweighted is the same as the weighted. However, in other cases, a significant amount is removed. 15 cM of Matthew’s 35 cM was removed by Timber, reducing his total to 20 cM.
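The Timber arithmetic is simple subtraction: unweighted minus weighted. As a rough sketch (my own illustration, not Ancestry’s code), this Python snippet takes the weighted and unweighted amounts from my V1 column in the table above and reports how much was removed for each match. The function name timber_removed is just a label I made up.

```python
# (weighted cM, unweighted cM) from the Ancestry V1 column of the match table.
matches = {
    "Edward": (66, 86),
    "Tom": (59, 63),
    "Jonathon": (43, 52),
    "Matthew": (20, 35),
    "Dean": (67, 78),
    "Jared": (20, 22),
}

def timber_removed(weighted, unweighted):
    """Return (cM removed, share of the unweighted amount removed)."""
    removed = unweighted - weighted
    return removed, removed / unweighted

for name, (weighted, unweighted) in matches.items():
    removed, share = timber_removed(weighted, unweighted)
    print(f"{name}: {removed} cM removed ({share:.0%} of unweighted)")
```

For smaller matches, the proportion removed matters more than the raw number, because it can push a match close to, or below, the matching threshold.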

Remember that Ancestry does not show shared matches unless they are greater than 20 cM, which is different than any other DNA testing company.

At one point, Ancestry was selling a health test that was also a genealogy test. That test utilized a different chip that is not accepted for uploads by other vendors. The results of that test might well be different than the “normal” Ancestry tests focused on genealogy. The Ancestry health test is no longer offered.

Companies that Accept Uploads

DNA testing companies that accept uploaded DNA files from other DNA testing companies need to process the uploaded file, just like a file that is generated in their own lab. Of course, they must deal with the differences between uploaded files and their own file format. The processing includes imputation and formulates the uploaded file so that it works with the tools that they provide for their customers, including ethnicity (by whatever name they use), matching, family matching (bucketing), advanced matching, the match matrix, triangulation, AutoClusters, Theories of Family Relativity, and other advanced tools.

Of course, the testing company accepting uploads can only work with the DNA locations provided by the original DNA testing company in the uploaded file.

Matching and some additional tools are free to uploaders, but advanced tools require an inexpensive unlock.

FamilyTreeDNA

I took a test at FamilyTreeDNA, plus uploaded a copy of both of my Ancestry DNA files.

FamilyTreeDNA named their population (ethnicity) test myOrigins and the current version is V3. I wrote about the rollout and comparison in September of 2020, here.

My DNA test taken at FamilyTreeDNA, above, reveals Native American segments that match reference populations found both in North and South America and the Caribbean Islands.

At FamilyTreeDNA, my Ancestry V1 uploaded file results show Native American population matches only in North America.

Interestingly, my Ancestry V1 file processed AT Ancestry did not reveal Native American ancestry, but the same file uploaded to and processed at FamilyTreeDNA did show Native American results, reflecting the difference between the vendors’ internal algorithms and reference populations utilized.

My myOrigins results from my Ancestry V2 uploaded file at FamilyTreeDNA also include my North American Native American segments. The V2 test also showed Native American ethnicity at Ancestry, so clearly something changed in Ancestry’s algorithm, locations tested, and/or reference populations between V1 and V2.

Fortunately, FamilyTreeDNA provides both chromosome painting and a population download file so I can match those Native segments with my autosomal matches to identify which of my ancestors contributed those specific segments.

One of my Native segments is shown in pink on Chromosome 1. My mother has a Native segment in exactly the same location, so I know that this segment originated with my mother’s ancestors.

I downloaded the myOrigins population segment file and painted my results at DNAPainter, along with the matches where I can identify our common ancestor. This allowed me to pinpoint the ancestral line that contributed this Native segment in my maternal line. You can read about using DNAPainter, here.

FamilyTreeDNA Matches

I have significantly more matches at FamilyTreeDNA on their test than on either of my Ancestry tests that I uploaded. However, nearly the same number are maternally or paternally assigned through Family Matching, with the remainder unassigned. You can read about Family Matching here.

Match Category | FamilyTreeDNA Test | Ancestry V1 at FamilyTreeDNA | Ancestry V2 at FamilyTreeDNA
Paternal | 3,479 | 3,572 | 3,422
Maternal | 1,549 | 1,536 | 1,477
Both | 3 | 3 | 3
All | 8,154 | 6,397 | 6,579

Family matching, aka bucketing, automatically assigns my matches as maternal and paternal by linking known relatives to their place in my tree.

I completed the following match chart using my original test taken at FamilyTreeDNA, plus the same match at FamilyTreeDNA for both of my Ancestry tests.

In other words, Cheryl matched me at 467 cM on 21 segments on the original test taken at FamilyTreeDNA. She matched me on 473 cM and 21 segments on my Ancestry V1 test uploaded to FamilyTreeDNA and on 483 cM and 22 segments on the Ancestry V2 test uploaded to FamilyTreeDNA.

Match | FamilyTreeDNA | Ancestry V1 at FTDNA | Ancestry V2 at FTDNA
Cheryl | 467 cM, 21 seg | 473 cM, 21 seg | 483 cM, 22 seg
Patricia | 195 cM, 11 seg | 189 cM, 11 seg | 188 cM, 11 seg
Tom | 77 cM, 4 seg | 71 cM, 4 seg | 76 cM, 4 seg
Thomas | 72 cM, 3 seg | 71 cM, 3 seg | 74 cM, 3 seg
Roland | 29 cM, 1 seg | 35 cM, 2 seg | 35 cM, 2 seg
Rex | 62 cM, 4 seg | 55 cM, 3 seg | 57 cM, 3 seg
Don | 395 cM, 18 seg | 362 cM, 15 seg | 398 cM, 18 seg
Ian | 64 cM, 4 seg | 56 cM, 4 seg | 64 cM, 4 seg
Stacy | 490 cM, 18 seg | 494 cM, 15 seg | 489 cM, 14 seg
Harold | 127 cM, 5 seg | 133 cM, 6 seg | 143 cM, 6 seg
Dean | 81 cM, 4 seg | 75 cM, 3 seg | 83 cM, 4 seg
Carl | 103 cM, 4 seg | 101 cM, 4 seg | 102 cM, 4 seg
Debbie | 99 cM, 5 seg | 97 cM, 5 seg | 99 cM, 5 seg
David | 373 cM, 16 seg | 435 cM, 19 seg | 417 cM, 18 seg
Amos | 176 cM, 7 seg | 177 cM, 8 seg | 177 cM, 7 seg
Buster | 387 cM, 15 seg | 396 cM, 16 seg | 402 cM, 17 seg
Charlene | 461 cM, 21 seg | 450 cM, 21 seg | 448 cM, 20 seg
Carol | 65 cM, 6 seg | 64 cM, 6 seg | 65 cM, 6 seg

I have tested many of my cousins at FamilyTreeDNA and encouraged others to test or upload. I’ve attempted to include enough people so that I can have common matches at least at one other DNA testing company for comparison.

FamilyTreeDNA Summary

The matches are relatively close, with a few being exact.

Interestingly, some of the segment counts are different. In most cases, this results from one segment being broken into multiple segments by one or more of the tests, but not always. In the couple that I checked, the entire segment seems to descend from the same ancestral couple, so the break is likely a result of not all of the same DNA locations being tested, plus the limits of imputation.
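To put “relatively close” into numbers, here is a short Python sketch that takes the cM values from the match chart above and measures the spread between the lowest and highest value for each person across the three files. Measuring the spread against the FamilyTreeDNA test value is my own arbitrary choice of yardstick, and the names used in the code are mine.

```python
# cM shared with each match:
# (FamilyTreeDNA test, Ancestry V1 upload, Ancestry V2 upload)
cm = {
    "Cheryl": (467, 473, 483), "Patricia": (195, 189, 188),
    "Tom": (77, 71, 76), "Thomas": (72, 71, 74),
    "Roland": (29, 35, 35), "Rex": (62, 55, 57),
    "Don": (395, 362, 398), "Ian": (64, 56, 64),
    "Stacy": (490, 494, 489), "Harold": (127, 133, 143),
    "Dean": (81, 75, 83), "Carl": (103, 101, 102),
    "Debbie": (99, 97, 99), "David": (373, 435, 417),
    "Amos": (176, 177, 177), "Buster": (387, 396, 402),
    "Charlene": (461, 450, 448), "Carol": (65, 64, 65),
}

def spread(values):
    """Difference between the largest and smallest cM value."""
    return max(values) - min(values)

def relative_spread(values):
    """Spread as a fraction of the FamilyTreeDNA test value (first entry)."""
    return spread(values) / values[0]

# Rank matches from the largest relative spread to the smallest.
ranked = sorted(cm.items(), key=lambda item: relative_spread(item[1]), reverse=True)

for name, values in ranked[:3]:
    print(f"{name}: {spread(values)} cM spread ({relative_spread(values):.0%})")
```

A small match can show a large percentage difference from just a few cM, so it’s worth looking at both the absolute and the relative spread before reading too much into either.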

MyHeritage

I have two tests at MyHeritage: one taken at MyHeritage and an uploaded file from FamilyTreeDNA.

MyHeritage displays both ethnicity results and Genetic Groups which maps groups of people that you match. I left the Genetic Groups setting at the highest confidence level. Shifting it to lower displays additional Genetic Groups, some of which overlap with or are within ethnicity regions.

My test taken at MyHeritage, above, shows several ethnicities and Genetic Groups, but no Native American.

My FamilyTreeDNA kit processed at MyHeritage shows the same ethnicity regions, one additional Genetic Group, plus Native American heritage in the Amazon which is rather surprising given that I don’t show Native in North American regions where I’m positive my Native ancestors lived.

MyHeritage Matching

At MyHeritage, I compared the results of the test I took with MyHeritage, and a test I uploaded from FamilyTreeDNA. Fewer than half of my matches can be assigned to a parent via shared matching.

Matches | MyHeritage Test | FamilyTreeDNA at MyHeritage
Paternal | 4,422 | 6,501
Maternal | 2,660 | 3,655
Total | 13,233 | 16,147

I have rounded my matches at MyHeritage to the closest cM.

Match | MyHeritage Test | FamilyTreeDNA at MyHeritage
Michael | 801 cM, 32 seg | 823 cM, 31 seg
Cheryl | 467 cM, 23 seg | 477 cM, 23 seg
Roland | No match | 28 cM, 1 seg
Patty | 156 cM, 9 seg | 151 cM, 9 seg
Rex | 43 cM, 4 seg | 53 cM, 3 seg
Don | 369 cM, 16 seg | 382 cM, 17 seg

David | 449 cM, 17 seg | 460 cM, 17 seg
Charlene | 454 cM, 23 seg | 477 cM, 24 seg
Buster | 408 cM, 15 seg | 410 cM, 16 seg
Amos | 183 cM, 8 seg | Same
Carol | 78 cM, 6 seg | 87 cM, 7 seg

MyHeritage Summary

I was surprised to discover that Roland had no match with the MyHeritage test, but did with the FamilyTreeDNA test. I wonder if this is a searching or matching glitch, especially since both companies use the same chip. 28 cM in one segment is a reasonably large match, and even if it was divided in two, it would still be over the matching threshold. I know this is a valid match because Roland triangulates with me and several cousins, I’m positive of our common ancestor, and he also matches me at both FamilyTreeDNA and 23andMe.

Other than that, the matches are reasonably close, with one being exact.

Your Matches Aren’t Everyplace

I unsuccessfully searched for someone who was a match to me in all four databases. Ancestry does not permit match downloads, so I had to search manually. People don’t always use the same names in different databases.

Surprisingly, I was unable to find one match who is in all of the databases. Many people only suggest testing at Ancestry because they have the largest database, but if you look at the following comparison chart that I’ve created, you’ll see that 16 of 26 people, or 62% were not at Ancestry. Conversely, many people were at Ancestry and not elsewhere. I could not find five maternal and five paternal matches at Ancestry that I could identify as matches in another database. 40% were not elsewhere.

If you think for one minute that it doesn’t matter for genealogy if you’re in all four major databases, please reconsider. It surely does matter.

Every single vendor has matches that the others don’t. Substantial, important matches. I have found first and second-cousin matches in every database that weren’t elsewhere.

Many of the original testers have passed away and can’t test again. My mother can never test at either 23andMe or Ancestry, but she is at both FamilyTreeDNA and MyHeritage because I could upgrade her kit at FamilyTreeDNA after she died. I uploaded her to MyHeritage. Of course, because she is a generation closer to our ancestors, she has many valuable matches that I don’t.

Each vendor provides either an email address or a messaging platform for you to contact your matches. Don’t be discouraged if they don’t answer. Just today, I received a reply that was years in the making.

Genealogists hope for immediate gratification, but we are actually in this for the long game. Play it with every tool at your disposal.

The Answer

Does it matter if you test at a DNA testing company, or upload a file?

I know this was a very long answer to what my readers hoped was a simple yes or no question.

There is no consistent answer at either FamilyTreeDNA or MyHeritage, the two DNA testing companies that accept uploads. Be sure you’re in both databases. My closest two matches that I did not test were found at MyHeritage. Here’s a direct link to upload at MyHeritage.

Of the vendors, those two should be the closest to each other because they are both processed in the Gene by Gene lab, but again, the actual chip version, when the test was originally taken, and each vendor’s internal processing will result in differences. Neither the original test at the DNA testing company nor the uploaded files have consistently higher or lower matches. Neither type of test or upload appears to be universally more or less accurate. Differences in either direction seem to occur on a match-by-match basis. Many are so close as to be virtually equivalent, with a few seemingly random exceptions. Of course, we always have to consider Timber.

If you upload, unlock the advanced features at both FamilyTreeDNA and MyHeritage.

If you upload to a DNA testing company, you may discover in the future that some features and functions will only be available to original testers.

Personally, if I had the option, I would test at the company directly simply because it eliminates or at least reduces the possibility of future incompatibilities – with the exception of 23andMe which has chosen to not provide consistent updates to older tests. I’m incredibly grateful I didn’t test my mother or now deceased family members at 23andMe, and only there. I would be heartsick, heartbroken, and furious.

Our DNA is an extremely valuable resource for our genealogy. It’s the gift that truly keeps on giving, day after day, even when other records don’t exist. Be sure you and your family members are in each database one way or another, and test your Y-DNA (for males) and mitochondrial DNA (for everyone) to have a complete arsenal at your disposal.


DNA: In Search of…Signs of Endogamy

This is the fourth in our series of articles about searching for unknown close family members, specifically; parents, grandparents, or siblings. However, these same techniques can be applied by genealogists to ancestors further back in time as well.

In this article, we discuss endogamy – how to determine if you have it, from what population, and how to follow the road signs.

After introductions, we will be covering the following topics:

  • Pedigree collapse and endogamy
  • Endogamous groups
  • The challenge(s) of endogamy
  • Endogamy and unknown close relatives (parents, grandparents)
  • Ethnicity and Populations
  • Matches
  • AutoClusters
  • Endogamous Relationships
  • Endogamous DNA Segments
  • “Are Your Parents Related?” Tool
  • Surnames
  • Projects
  • Locations
  • Y DNA, Mitochondrial DNA, and Endogamy
  • Endogamy Tools Summary Tables
    • Summary of Endogamy Tools by Vendor
    • Summary of Endogamous Populations Identified by Each Tool
    • Summary of Tools to Assist People Seeking Unknown Parents and Grandparents

What Is Endogamy and Why Does It Matter?

Endogamy occurs when a group or population of people intermarry among themselves for an extended period of time, without the introduction of many or any people from outside of that population.

The effect of this continual intermarriage is that the founders’ DNA simply gets passed around and around, eventually in small segments.

That happens because there is no “other” DNA to draw from within the population. Knowing or determining that you have endogamy helps make sense of DNA matching patterns, and those patterns can lead you to unknown relatives, both close and distant.

This Article

This article serves two purposes.

  • This article is educational and relevant for all researchers. We discuss endogamy using multiple tools and examples from known endogamous people and populations.
  • In order to be able to discern endogamy when we don’t know who our parents or grandparents are, we need to know what signs and signals to look for, and why, which is based on what endogamy looks like in people who know their heritage.

There’s no crystal ball – no definitive “one-way” arrow, but there are a series of indications that suggest endogamy.

Depending on the endogamous population you’re dealing with, those signs aren’t always the same.

If you’re sighing now, I understand – but that’s exactly WHY I wrote this article.

We’re covering a lot of ground, but these road markers are invaluable diagnostic tools.

I’ve previously written about endogamy in the articles:

Let’s start with definitions.

Pedigree Collapse and Endogamy

Pedigree collapse isn’t the same as endogamy. Pedigree collapse is when you have ancestors that repeat in your tree.

In this example, the parents of our DNA tester are first cousins, which means the tester shares great-grandparents on both sides and, of course, the same ancestors from there on back in their tree.

This also means they share more of those ancestors’ DNA than they would normally share.

John Smith and Mary Johnson are both in the tree twice, in the same position as great-grandparents. Normally, Tester Smith would carry approximately 12.5% of each of his great-grandparents’ DNA, assuming for illustration purposes that exactly 50% of each ancestor’s DNA is passed in each generation. In this case, due to pedigree collapse, 25% of Tester Smith’s DNA descends from John Smith, and another 25% descends from Mary Johnson, double what it would normally be. 25% is the amount of DNA contribution normally inherited from grandparents, not great-grandparents.
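The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration under the same simplifying assumption used above, that exactly 50% of each ancestor's DNA is passed in each generation. The function name and the idea of describing an ancestor by the list of positions they occupy in the tree are my own, chosen for illustration.

```python
from fractions import Fraction

def expected_share(generations_back_per_appearance):
    """Expected fraction of a tester's DNA that descends from one ancestor.

    Each entry is how many generations back the ancestor sits for one of
    their positions in the tree (1 = parent, 2 = grandparent,
    3 = great-grandparent). Assumes exactly half of the DNA is passed on
    in every generation, which real inheritance only approximates.
    """
    return sum(Fraction(1, 2 ** g) for g in generations_back_per_appearance)

# No pedigree collapse: John Smith appears once, as a great-grandparent.
normal = expected_share([3])

# Parents are first cousins: John Smith appears twice as a great-grandparent.
collapsed = expected_share([3, 3])

print(f"Without pedigree collapse: {float(normal):.1%}")
print(f"With pedigree collapse:    {float(collapsed):.1%}")
```

Adding a third entry to the list would model an ancestor who appears in three positions, which is how repeated collapse in a small community pushes these shares up over time.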

While we may find first cousin marriages a bit eyebrow-raising today, they were quite common in the past. Both laws and customs varied with the country, time, social norms, and religion.

Pedigree Collapse and Endogamy Are NOT the Same

You might think that pedigree collapse and endogamy are one and the same, but there’s a difference. Pedigree collapse can lead to endogamy, but it takes more than one instance of pedigree collapse to morph into endogamy within a population. Population is the key word for endogamy.

The main difference is that pedigree collapse occurs with known ancestors in more recent generations for one person, while endogamy is longer-term and systemic in a group of people.

Picture a group of people, all descended from Tester Smith’s great-grandparents intermarrying. Now you have the beginnings of endogamy. A couple hundred or a few hundred years later, you have true endogamy.

In other words, endogamy is pedigree collapse on a larger scale – think of a village or a church.

My ancestors’ village of Schnait, in Germany, is shown above in 1685. One church and maybe 30 or 40 homes. According to church and other records, the same families had inhabited this village, and region, for generations. It’s a sure bet that both pedigree collapse and endogamy existed in this small community.

If pedigree collapse happens over and over again because there are no other people within the community to marry, then you have endogamy. In other words, with endogamy, you assuredly DO have historical pedigree collapse, generally back in time, often before you can identify those specific ancestors – because everyone descends from the same set of founders.

Endogamy Doesn’t Necessarily Indicate Recent Pedigree Collapse

With deep, historic endogamy, you don’t necessarily have recent pedigree collapse, and in fact, many people do not. Jewish people are a good example of this phenomenon. They shared ancestors for hundreds or thousands of years, depending on which group we are referring to, but in recent, known, generations, many Jewish people aren’t related. Still, their DNA often matches each other.

The good news is that there are telltale signs and signals of endogamy.

The bad news is that not all of these are obvious enough to help people seeking clues about unknown close relatives, and other “signs” aren’t what they are believed to be.

Let’s step through each endogamy identifier, or “hint,” and then we will review how we can best utilize this information.

First, let’s take a look at groups that are considered to be endogamous.

Endogamous Groups

Jewish People – Specifically groups that were isolated from other groups of Jewish (and other) people: Ashkenazi (Germany, Northern France, and diaspora), Sephardic (Spanish, Iberia, and diaspora), Mizrahi (Israel, Middle Eastern, and diaspora), Ethiopian Jews, and possibly Jews from other locations such as Mountain Jews from Kazakhstan and the Caucasus.

Acadians – Descendants of about 60 French families who settled in “Acadia” beginning about 1604, primarily in Nova Scotia, and intermarried among themselves and with the Mi’kmaq people. Expelled by the English in 1755, they were scattered in groups to various diasporic regions where they continued to intermarry and where their descendants are found today. Some Acadians became the Cajuns of Louisiana.

Anabaptist Protestant Faiths – Amish, Mennonite, and Brethren (Dunkards) and their offshoots are Protestant religious sects founded in Europe in the 14th, 15th, and 16th centuries on the principle of baptizing only adults or people who are old enough to choose to follow the faith, or rebaptizing people who had been previously baptized as children. These Anabaptist faiths tend to marry within their own group or church and often expel those who marry outside of the faith. Many emigrated to the American colonies and elsewhere, seeking religious freedom. Occasionally those groups would locate in close proximity and intermarry with each other, but they would not marry outside of the Anabaptist denominations.

Native American (Indigenous) People – all indigenous peoples found in North and South America before European colonization descended from a small number of original founders who probably arrived at multiple times.

Indigenous Pacific Islanders – Including indigenous peoples of Australia, New Zealand, and Hawaii prior to colonization. They are probably as endogamous as Native American people, but I don’t have specific examples to share.

Villages – European or other villages with little inflow or whose residents were restricted from leaving over hundreds of years.

Other groups may have significant multiple lines of pedigree collapse and therefore become endogamous over time. Some people from Newfoundland, French Canadians, and Mormons (Church of Jesus Christ of Latter-Day Saints) come to mind.

Endogamy is a process that occurs over time.

Endogamy and Unknown Relatives

If you know who your relatives are, you may already know you’re from an endogamous population, but if you’re searching for close relatives, it’s helpful to be able to determine if you have endogamous heritage, at least in recent generations.

If you know nothing about either parent, some of these tools won’t help you, at least not initially, but others will. However, as you add to your knowledge base, the other tools will become more useful.

If you know the identity of one parent, this process becomes at least somewhat easier.

In future articles, we will search specifically for parents and each of your four grandparents. In this article, I’ll review each of the diagnostic tools and techniques you can use to determine if you have endogamy, and perhaps pinpoint the source.

The Challenge

People with endogamous heritage are related in multiple, unknown ways, over many generations. They may also be related in known ways in recent generations.

If both of your parents share the SAME endogamous culture or group of relatives:

  • You may have significantly more autosomal DNA matches than people without endogamy, unless that group of people is under-sampled. Jewish people have significantly more matches, but Native people have fewer due to under-sampling.
  • You may experience a higher-than-normal cM (centiMorgan) total for estimated relationships, especially more distant relationships, 3C and beyond.
  • You will have many matches related to you on both your maternal and paternal sides.
  • Parts of your autosomal DNA will be the same on both your mother’s and father’s sides, meaning your DNA will be fully identical in some locations. (I’ll explain more in a minute.)

If either (or both) of your parents are from an endogamous population, you:

  • Will, in some cases, carry identifying Y and mitochondrial DNA that points to a specific endogamous group. This is true for Native people, can be true for Jewish people and Pacific Islanders, but is not true for Anabaptist people.

One Size Does NOT Fit All

Please note that there is no “one size fits all.”

Each or any of these tools may provide relevant hints, depending on:

  • Your heritage
  • How many other people have tested from the relevant population group
  • How many close or distant relatives have tested
  • If your parents share the same heritage
  • Your unique DNA inheritance pattern
  • If your parents, individually, were fully endogamous or only partly endogamous, and how far back generationally that endogamy occurred

For example, in my own genealogy, my maternal grandmother’s father was Acadian on his father’s side. While I’m not fully endogamous, I have significantly more matches through that line proportionally than on my other lines.

I have Brethren endogamy on my mother’s side via her paternal grandmother.

Endogamous ancestors are shown with red stars on my mother’s pedigree chart, above. However, please note that her maternal and paternal endogamous ancestors are not from the same endogamous population.

However, I STILL have fewer matches on my mother’s side in total than on my father’s side because my mother has recent Dutch and recent German immigrant ancestors, which reduces her total number of matches. Neither of those lines has had as much time to produce descendants in the US, and Europe is under-sampled when compared with the US, where more people tend to take DNA tests because they are searching for where they came from.

My father’s ancestors have been in the US since it was a British Colony, and I have many more cousins who have tested on his side than mother’s.

If you looked at my pedigree chart and thought to yourself, “that’s messy,” you’d be right.

The “endogamy means more matches” axiom does not hold true for me, comparatively, between my parents – in part because my mother’s German and Dutch lines are such recent immigrants.

The number of matches alone isn’t going to tell this story.

We are going to need to look at several pieces and parts for more information. Let’s start with ethnicity.

Ethnicity and Populations

Ethnicity can be a double-edged sword. It can tell you exactly nothing you couldn’t discern by looking in the mirror, or, conversely, it can be a wealth of information.

Ethnicity reveals the parts of the world where your ancestors originated. When searching for recent ancestors, you’re most interested in majority ethnicity, meaning the 50% of your DNA that you received from each of your parents.

Ethnicity results at each vendor are easy to find and relatively easy to understand.

This individual at FamilyTreeDNA is 100% Ashkenazi Jewish.

If they were 50% Jewish, we could then estimate, and that’s an important word, that either one of their parents was fully Jewish, and not the other, or that two of their grandparents were Jewish, although not necessarily on the same side.

On the other hand, my mother’s ethnicity, shown below, has nothing remarkable that would point to any majority endogamous population, yet she has two.

The only hint of endogamy from ethnicity would be her ~1% Americas, and that isn’t relevant for finding close relatives. However, minority ancestry is very relevant for identifying Native ancestors, which I wrote about, here.

You can correlate or track your ethnicity segments to specific ancestors, which I discussed in the article, Native American & Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments, here.

Since I wrote that article, FamilyTreeDNA has added the feature of ethnicity or population Chromosome Painting, based on where each of your populations fall on your chromosomes.

In this example on chromosome 1, I have European ancestry (blue,) except for the pink Native segment, which occurs on the following segment in the same location on my mother’s chromosome 1 as well.

Both 23andMe and FamilyTreeDNA provide chromosome painting AND the associated segment information so you can identify the relevant ancestors.

Ancestry is in the process of rolling out an ethnicity painting feature, BUT, it has no segment or associated matching information. While it’s interesting eye candy, it’s not terribly useful beyond the ethnicity information that Ancestry already provides. However, Jonny Perl at DNAPainter has devised a way to estimate Ancestry’s start and stop locations, here. Way to go Jonny!

Now all you need to do is convince your Ancestry matches to upload their DNA file to one of the three databases, FamilyTreeDNA, MyHeritage, and GEDMatch, that accept transfers, aka uploads. This allows matching with segment data so that you can identify who matches you on that segment, track your ancestors, and paint your ancestral segments at DNAPainter.

I provided step-by-step instructions, here, for downloading your raw DNA file from each vendor in order to upload the file to another vendor.

Ethnicity Sides

Three of the four DNA testing vendors, 23andMe, FamilyTreeDNA, and recently, Ancestry, attempt to phase your ethnicity DNA, meaning to assign it to one parental “side” or the other – both in total and on each chromosome.

Here’s Ancestry’s SideView, where your DNA is estimated to belong to parent 1 and parent 2. I detailed how to determine which side is which, here, and while that article was written specifically pertaining to Ancestry’s SideView, the technique is relevant for all the vendors who attempt to divide your DNA into parents, a technique known as phasing.

I say “attempt” because phasing may or may not be accurate, meaning the top chromosome may not always be parent 1, and the bottom chromosome may not always be parent 2.

Here’s an example at 23andMe.

See the two yellow segments. They are both assigned as Native. I happen to know one is from the mother and one is from the father, yet they are both displayed on the “top” chromosome, which one would interpret to be the same parent.

I am absolutely positive this is not the case because this is a close family member, and I have the DNA of the parent who contributed the Native segment on chromosome 1, on the top chromosome. That parent does not have a Native segment on chromosome 2 to contribute. So that Native segment had to be contributed by the other parent, but it’s also shown on the top chromosome.

The DNA segments circled in purple belong together on the same “side” and were contributed to the tester by the same parent. The Native segment on chromosome 2 abuts a purple African segment, suggesting perhaps that the ancestor who contributed that segment was mixed between those ethnicities. In the US, that suggests enslavement.

The other African segments, circled, are shown on the second chromosome in each pair.

To be clear, parent 1 is not assigned by the vendors to either mother or father and will differ by person. Your parent 1, or the parent on the top chromosome may be your mother and another person’s parent 1 may be their father.

As shown in this example, parents can vary by chromosome, a phenomenon known as “strand swap.” Occasionally, the DNA can even be swapped within a chromosome assignment.

You can, however, get an idea of the division of your DNA at any specific location. As shown above, you can only have a maximum of two populations of DNA on any one chromosome location.

In our example above, this person’s majority ancestry is European (blue.) On each chromosome where we find a minority segment, the opposite chromosome in the same location is European, meaning blue.

Let’s look at another example.

At FamilyTreeDNA, the person whose ethnicity painting is shown below has a Native American (pink) ancestor on their father’s side. FamilyTreeDNA has correctly phased or identified their Native segments as all belonging to the second chromosome in each pair.

Looking at chromosome 18, for example, most of their father’s chromosome is Native American (pink). The other parent’s chromosome is European (dark blue) at those same locations.

If one of the parents was of one ethnicity, and the other parent was of a completely different ethnicity, then one bar of each chromosome would be all pink, for example, and one would be entirely blue, representing the other ethnicity.

Phasing ethnicity or populations to maternal and paternal sides is not foolproof, and each chromosome is phased individually.

Ethnicity can, in some cases, give you a really good idea of what you’re dealing with in terms of heritage and endogamy.

If someone had an Ashkenazi Jewish father and European mother, for example, one copy of each chromosome would be yellow (Ashkenazi Jewish), and one would be blue (European.)

However, if each of their parents were half European Jewish and half European (not Jewish), then their different colored segments would be scattered across their entire set of chromosomes.

In this case, both of the tester’s parents are mixed – European Jewish (green) and Western Europe (blue.) We know both parents are admixed from the same two populations because in some locations, both parents contributed blue (Western Europe), and in other locations, both contributed Jewish (green) segments.

Both MyHeritage and Ancestry provide a secondary tool that’s connected to ethnicity, but different and generally in more recent times.

Ancestry’s DNA Communities

While your ethnicity may not point to anything terribly exciting in terms of endogamy, DNA Communities might. Ancestry says that a DNA Community is a group of people who share DNA because their relatives recently lived in the same place at the same time, and that communities are much smaller than ethnicity regions and reach back only about 50-300 years.

Based on the ancestors’ locations in my tree and in my matches’ trees, Ancestry has determined that I’m connected to two communities. In my case, the blue group is clearly my father’s line. The orange group could be either parent, or even a combination of both.

My endogamous Brethren could be showing up in Maryland, Pennsylvania, and Ohio, but it’s uncertain, in part, because my father’s ancestral lines are found in Virginia, West Virginia, and Maryland too.

These aren’t useful for me, but they may be more useful for fully endogamous people, especially in conjunction with ethnicity.

My Acadian cousin’s European ethnicity isn’t informative.

However, viewing his DNA Communities puts his French heritage into perspective, especially combined with his match surnames.

I wrote about DNA Communities when it was introduced with the name Genetic Communities, here.

MyHeritage’s Genetic Groups

MyHeritage also provides a similar feature that shows where my matches’ ancestors lived in the same locations as mine.

One difference, though, is that testers can adjust their ethnicity results confidence level from high, above, to low, below, where one of my Genetic Groups overlaps my ethnicity in the Netherlands.

You can also sort your matches by Genetic Groups.

The results show you not only who is in the group, but how many of your matches are in that group too, which provides perspective.

I wrote about Genetic Groups, here.

Next, let’s look at how endogamy affects your matches.

Matches

A person from an entirely endogamous community and a person with no endogamy may have very different numbers of matches.

FamilyTreeDNA provides a Family Matching feature that triangulates your matches and assigns them to your paternal or maternal side by using known matches that you have linked to their profile cards in your tree. You must link people for the Family Matching feature known as “bucketing” to be enabled.

The people you link are then processed for shared matches on the same chromosome segment(s). Triangulated individuals are then deposited in your maternal, paternal, and both buckets.

Obviously, your two parents are the best people to link, but if they haven’t tested (or uploaded their DNA file from another vendor) and you have other known relatives, link them using the Family Tree tab at the top of your personal page.

I uploaded my Ancestry V4 kit to use as an example for linking. Let’s pretend that’s my sister. If I had not already linked my Ancestry V4 kit to “my sister’s” profile card, I’d want to do that and link other known individuals the same way. Just drag and drop the match to the correct profile card.

Note that a full or half sibling will be listed as such at FamilyTreeDNA, but an identical twin will show as a potential parent/child match to you. You’re much more likely to find a parent than an identical twin, but just be aware.

I’ve created a table of FamilyTreeDNA bucketed match results, by category, comparing the number of matches in endogamous categories with non-endogamous.

Tester | Total Matches | Maternal Matches | Paternal Matches | Both | % Both | % DNA Unassigned
--- | --- | --- | --- | --- | --- | ---
100% Jewish | 34,637 | 11,329 | 10,416 | 4,806 | 13.9 | 23.3
100% Jewish | 32,973 | 10,700 | 9,858 | 4,606 | 14 | 23.7
100% Jewish | 32,255 | 9,060 | 10,970 | 3,892 | 12 | 25.8
75% Jewish | 24,232 | 11,846 | Only mother linked | Only mother linked | Only mother linked |
100% Acadian | 8,093 | 3,826 | 2,299 | 1,062 | 13 | 11
100% Acadian | 7,828 | 3,763 | 1,825 | 923 | 11.8 | 17
Not Endogamous | 6,760 | 3,845 | 1,909 | 13 | 0.19 | 14.5
Not Endogamous | 7,723 | 1,470 | 3,317 | 6 | 0.08 | 38
100% Native American | 1,115 | Unlinked | Unlinked | Unlinked | |
100% Native American | 885 | 290 | Unknown | Can’t calculate without at least one link on both sides | |

The 100% Jewish, Acadian, and Not Endogamous testers have all linked both of their parents, so their matches, if valid (meaning not identical by chance, which I discussed here,) will match them plus one or the other parent.

One person is 75% Jewish and has only linked their Jewish mother.

The Native people have not tested their parents, and the first Native person has not linked anyone in their tree. The second Native person has only linked a few maternal matches, but their mother has not tested. They are seeking their father.

It’s very difficult to find people who are fully Native as testers. Furthermore, Native people are under-sampled. If anyone knows of fully Native (or other endogamous) people who have tested and linked their parents or known relatives in their trees, and will allow me to use their total match numbers anonymously, please let me know.

As you can see, Jewish, Acadian, and Native people are 100% endogamous, but many more Jewish people than Native people have tested, so you CAN’T judge endogamy by the total number of matches alone.

In fact, in order:

  • Fully Jewish testers have about 4-5 times as many matches as the Acadian and Non-endogamous testers
  • Acadian and Non-endogamous testers have about 6-9 times as many matches as the Native American testers
  • Fully Jewish people have about 30 times more matches than the Native American testers

If a person’s endogamy with a particular population is only on their maternal or paternal side, they won’t have a significant number of people related to both sides, meaning few people will fall into the “Both” bucket. People that will always be found in the “Both” bucket are full siblings and their descendants, along with descendants of the tester, assuming their match is linked to their profiles in the tester’s tree.

In the case of our Jewish testers, you can easily see that the “Both” bucket is very high. The Acadians are also higher than one would reasonably expect without endogamy. A non-endogamous person might have a few matches on both sides, assuming the parents are not related to each other.

A high number of “Both” matches is a very good indicator of endogamy within the same population on both parents’ sides.

The percentage of people who are assigned to the “Both” bucket is between 11% and 14% in the endogamous groups, and less than 1% in the non-endogamous group, so statistically not relevant.
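For readers who want to run the same comparison on their own bucket counts, here is a minimal Python sketch. It assumes that the unassigned matches are simply the total minus the maternal, paternal, and both buckets. That is my reading of the table rather than a documented FamilyTreeDNA formula, although it reproduces the table’s percentages to within rounding. The function name is hypothetical.

```python
def bucket_percentages(total, maternal, paternal, both):
    """Return (% Both, % unassigned) for one tester's bucketed match counts.

    Assumption: every match is in exactly one of the maternal, paternal,
    or both buckets, or is unassigned.
    """
    unassigned = total - maternal - paternal - both
    return 100 * both / total, 100 * unassigned / total

# Counts copied from the table above: (tester, total, maternal, paternal, both)
testers = [
    ("100% Jewish", 34637, 11329, 10416, 4806),
    ("100% Acadian", 8093, 3826, 2299, 1062),
    ("Not Endogamous", 7723, 1470, 3317, 6),
]

for label, total, maternal, paternal, both in testers:
    pct_both, pct_unassigned = bucket_percentages(total, maternal, paternal, both)
    print(f"{label}: {pct_both:.2f}% Both, {pct_unassigned:.1f}% unassigned")
```

A “% Both” value in the double digits, like the Jewish and Acadian testers here, is the pattern to watch for. A fraction of one percent is what the non-endogamous testers show.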

As demonstrated by the Native people compared to the Jewish testers, the total number of matches can be deceiving.

However, being related to both parents, as indicated by the “Both” bucket, unless you have pedigree collapse, is a good indicator of endogamy.

Of course, if you don’t know who your relatives are, you can’t link them in your tree, so this type of “hunt” won’t generally help people seeking their close family members.

However, you may notice that you’re matching people PLUS both of their parents. If that’s the case, start asking questions of those matches about their heritage.

A very high number of total matches, as compared to non-endogamous people, combined with some other hints might well point to Jewish heritage.

I included the % DNA Unassigned category because this category, when both parents are linked, is the percentage of matches by chance, meaning the match doesn’t match either of the tester’s parents. All of the testers with matches listed in the “Both” category have linked both of their parents, not just maternal and paternal relatives.

Matching Location at MyHeritage

MyHeritage provides a matching function by location. Please note that it’s the location of the tester, but that may still be quite useful.

The locations are shown in the most-matches to least-matches order. Clicking on the location shows the people who match you who are from that location. This would be the most useful in situations where recent immigration has occurred. In my case, my great-grandfather from the Netherlands arrived in the 1860s, and my German ancestors arrived in the 1850s. Neither of those groups is endogamous, though, unless it would be on a village level.

AutoClusters

Let’s shift to Genetic Affairs, a third-party tool available to everyone.

Using their AutoCluster function, Genetic Affairs clusters your matches together who match both each other and you.

This is an example of the first few clusters in my AutoCluster. You can see that I have several colored clusters of various sizes, but none are huge.

Compare that to the following endogamous cluster, sample courtesy of EJ Blom at Genetic Affairs.

If your AutoCluster at Genetic Affairs looks something like this, a huge orange blob in the upper left hand corner, you’re dealing with endogamy.

Please note that the size of your cluster is also a function of both the number of testers and the match threshold you select. I always begin by using the defaults. I wrote about using Genetic Affairs, here.

If you tested at or transferred to MyHeritage, they too license AutoClusters, but have optimized the algorithm to tease out endogamous matches so that their Jewish customers, in particular, don’t wind up with a huge orange block of interrelated people.

You won’t see the “endogamy signature” huge cluster in the corner, so you’re less likely to be able to discern endogamy from a MyHeritage cluster alone.

The commonality between these Jewish clusters at MyHeritage is that they all tend to be rather uniform in size and small, with lots of grey connecting almost all the blocks.

Grey cells indicate people who match people in two colored groups. In other words, there is often no clear division in clusters between the mother’s side and the father’s side in Jewish clusters.

In non-endogamous situations, even if you can’t identify the parents, the clusters should still fall into two sides, meaning a group of clusters for each parent’s side that are not related to each other.

You can read more about Genetic Affairs clusters and their tools, here. DNAGedcom.com also provides a clustering tool.

Endogamous Relationships

Endogamous estimated relationships are sometimes high. Please note the word, “sometimes.”

Using the Shared cM Project tool relationship chart, here, at DNAPainter, people with heavy endogamy will discover that estimated relationships MAY be on the high side, or the relationships may, perhaps, be estimated too “close” in time. That’s especially true for more distant relationships, but surprisingly, it’s not always true. The randomness of inheritance still comes into play, and so do potential unknown relatives. Hence, the word “may” is emphasized.

Unfortunately, it’s often stated as “conventional wisdom” that Jewish matches are “always” high, and first cousins appear as siblings. Let’s see what the actual data says.

At DNAPainter, you can either enter the amount of shared DNA (cM), or the percent of shared DNA, or just use the chart provided.

I’ve assembled a compilation of close relationships in kits that I have access to or from people who were generous enough to share their results for this article.

I’ve used results from Jewish testers, a highly endogamous population, compared with non-endogamous testers.

The “Jewish Actual” column reports the total amount of DNA the tester shares with that relative, for example, someone compared to their grandparent. The Average Range is the average plus the range from DNAPainter. The Percent Difference is the % difference between the actual number and the DNAPainter average.

You’ll see fully Jewish testers, at left, matching with their family members, and a Non-endogamous person, at right, matching with their same relative.

Relationship | Jewish Actual (cM) | Percent Difference from Average | Average (Range) | Non-endogamous Actual (cM) | Percent Difference from Average
--- | --- | --- | --- | --- | ---
Grandparent | 2141 | 22 | 1754 (984-2482) | 1742 | <1 lower
Grandparent | 1902 | 8.5 | 1754 (984-2482) | 1973 | 12
Sibling | 3039 | 16 | 2613 (1613-3488) | 2515 | 3.5 lower
Sibling | 2724 | 4 | 2613 (1613-3488) | 2761 | 5.5
Half-Sibling | 2184 | 24 | 1759 (1160-2436) | 2127 | 21
Half-Sibling | 2128 | 21 | 1759 (1160-2436) | 2352 | 34
Aunt/Uncle | 2066 | 18.5 | 1741 (1201-2282) | 1849 | 6
Aunt/Uncle | 2031 | 16.5 | 1741 (1201-2282) | 2097 | 20
1C | 1119 | 29 | 866 (396-1397) | 959 | 11
1C | 909 | 5 | 866 (396-1397) | 789 | 9 lower
1C1R | 514 | 19 | 433 (102-980) | 467 | 8
1C1R | 459 | 6 | 433 (102-980) | 395 | 9 lower

These totals are from FamilyTreeDNA except one from GEDMatch (one Jewish Half-sibling).

Totals may vary by vendor, even when matching with the same person. 23andMe includes the X segments in the total cMs and also counts fully identical segments twice. MyHeritage imputation seems to err on the generous side.

However, in these dozen examples:

  • You can see that the Jewish actual amount of DNA shared is always more than the average in the estimate.
  • The red means the overage is more than 100 cM larger.
  • The percentage difference is probably more meaningful because 100 cM is a smaller percentage of a 1754 cM grandparent connection than of a 433 cM 1C1R connection.

However, you can’t tell anything about endogamy by just looking at any one sample, because:

  • Some of the Non-Endogamous matches are high too. That’s just the way of random inheritance.
  • All of the actual Jewish match numbers are within the published ranges, but on the high side.
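If you want to run the same comparison on your own matches, the two checks used above (percent difference from the average, and whether the actual value falls inside the published range) can be sketched in Python as follows. The function names are hypothetical, and the averages and ranges are simply the DNAPainter figures quoted in the table above, typed in by hand. The sketch does not call any DNAPainter service.

```python
def percent_difference(actual_cm, average_cm):
    """Percent by which a match's shared cM differs from the relationship average."""
    return 100 * (actual_cm - average_cm) / average_cm

def within_range(actual_cm, low_cm, high_cm):
    """True if the shared cM falls inside the published range for the relationship."""
    return low_cm <= actual_cm <= high_cm

# (relationship, actual cM, average cM, range low, range high) from the table above
jewish_examples = [
    ("Grandparent", 2141, 1754, 984, 2482),
    ("1C", 1119, 866, 396, 1397),
    ("1C1R", 514, 433, 102, 980),
]

for relationship, actual, average, low, high in jewish_examples:
    diff = percent_difference(actual, average)
    print(f"{relationship}: {actual} cM is {diff:+.0f}% vs. the average, "
          f"within range: {within_range(actual, low, high)}")
```

A positive percentage that still sits inside the range is the pattern seen in the Jewish examples: high, but not impossible for the stated relationship.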

Furthermore, it can get more complex.

Half Endogamous

I requested assistance from Jewish genealogy researchers, and a lovely lady, Sharon, reached out, compiled her segment information, and shared it with me, granting permission to share with you. A HUGE thank you to Sharon!

Sharon is half-Jewish via one parent, and her half-sibling is fully Jewish. Their half-sibling match to each other at Ancestry is 1756 cM with a longest segment of 164 cM.

How does Jewish matching vary if you’re half-Jewish versus fully Jewish? Let’s look at 21 people who match both Sharon and her fully Jewish half-sibling.

Sharon shared the differences in 21 known Jewish matches with her and her half-sibling. I’ve added the Relationship Estimate Range from DNAPainter and colorized the higher of the two matches in yellow. Bolding in the total cM column shows a value above the average range for that relationship.

Total Matching cMs is on the left, with Longest Segment on the right.

While this is clearly not a scientific study, it is a representative sample.

The fully Jewish sibling carries more Jewish DNA, which is available for other Jewish matches to match as a function of endogamy (identical by chance/population), so I would have expected the fully Jewish sibling to match most if not all Jewish testers at a higher level than the half-Jewish sibling.

However, that’s not universally what we see.

The fully Jewish sibling does not always share the most DNA with the other Jewish testers, and the half-Jewish tester has the larger “Longest Segment” more often than not.

Approximately two-thirds of the time (13/21), the fully Jewish person does have a higher total matching cM, but about one-third of the time (8/21), the half-Jewish sibling has a higher matching cM.

About one-fourth of the time (5/21), the fully Jewish sibling has the longest matching segment, and about two-thirds of the time (13/21), the half-Jewish sibling does. In three cases, or about 14% of the time, the longest segment is equal, which may indicate that it’s the same segment.
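
If you’d like to check those fractions yourself, here is a small Python sketch of the tally. The counts are the ones given above; the variable and function names are mine, purely for illustration.

```python
TOTAL_MATCHES = 21

# Counts taken from the comparison described above.
higher_total_cm = {"fully Jewish sibling": 13, "half-Jewish sibling": 8}
longest_segment = {"fully Jewish sibling": 5, "half-Jewish sibling": 13, "equal": 3}


def shares(counts, total):
    """Convert raw counts into whole-number percentages of the total."""
    return {label: round(count / total * 100) for label, count in counts.items()}


total_cm_shares = shares(higher_total_cm, TOTAL_MATCHES)
longest_segment_shares = shares(longest_segment, TOTAL_MATCHES)

print(total_cm_shares)
print(longest_segment_shares)
```

Both tallies add up to all 21 matches, so nothing has been double counted.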

Because of endogamy, Jewish matches are more likely to have:

  • Larger than average total cM for the specific relationship
  • More and smaller matching segments

However, as we have seen, neither of those is definitive, nor always true. Jewish matches and relationships are not always overestimated.

Ancestry and Timber

Please note that Ancestry downweights some matches by removing some segments using their Timber algorithm. Based on my matches and other accounts that I manage, Ancestry does not downweight in the 2-3rd cousin category, which is 90 cM and above, but they do begin downweighting in the 3-4th cousin category, below 90 cM, where my “Extended Family” category begins.

If you’ve tested at Ancestry, you can check for yourself.

By clicking on the amount of DNA you share with your match on your match list at Ancestry, shown above, you will be taken to another page where you will be able to view the unweighted shared DNA with that match, meaning the amount of DNA shared before the downweighting and removal of some segments, shown below.

Given the downweighting, and the information in the spreadsheet provided by Sharon, it doesn’t appear that any of those matches would have been in a category to be downweighted.

Therefore, for these and other close matches, Timber wouldn’t be a factor, but would potentially be in more distant matches.

Endogamous Segments

Endogamous matches tend to have smaller and more segments. Small amounts of matching DNA tend to skew the total DNA cM upwards.

How and why does this happen?

Ancestral DNA from further back in time tends to be broken into smaller segments.

Sometimes, especially in endogamous situations, two smaller segments, at one time separated from each other, manage to join back together again and form a match, but the match is only due to ancestral segments – not because of a recent ancestor.

Please note that different vendors have different minimum matching cM thresholds, so smaller matches may not be available at all vendors. Remember that factors like Timber and imputation can affect matching as well.

Let’s take a look at an example. I’ve created a chart where two ancestors have their blue and pink DNA broken into 4 cM segments.

They have children, a blue child and a pink child, and the two children, shown above, each inherited the same blue 4 cM segment and the same pink 4 cM segment from their respective parents. The other unlabeled pink and blue segments are not inherited by these two children, so those unlabeled segments are irrelevant in this example.

The parents may have had other children who inherited those same 4 cM labeled pink and blue segments as well, and if not, the parents’ siblings were probably passing at least some of the same DNA down to their descendants too.

The blue and pink children had children, and their children had children – for several generations.

Time passed, and their descendants became an endogamous community. Those pink and blue 4 cM segments may at some time be lost during recombination in the descendants of each of their children, shown by “Lost pink” and “Lost blue.”

However, because there is only a very limited amount of DNA within the endogamous community, their descendants may regain those same segments again from their “other parent” during recombination, downstream.

In each generation, the DNA of the descendant carrying the original blue or pink DNA segment is recombined with their partner. Given that the partners are both members of the same endogamous community, the two people may have the same pink and/or blue DNA segments. If one parent doesn’t carry the pink 4 cM segment, for example, their offspring may receive that ancestral pink segment from the other parent.

They could potentially, and sometimes do, receive that ancestral segment from both parents.

In our example, the descendants of the blue child, at left, lost the pink 4 cM segment in generation 3, but a few generations later, in generation 11, that descendant child inherited that same pink 4 cM segment from their other parent. Therefore, both the 4 cM blue and 4 cM pink segments are now available to be inherited by the descendants in that line. I’ve shown the opposite scenario in the generational inheritance at right where the blue segment is lost and regained.

Once rejoined, that pink and blue segment can be passed along together for generations.

The important part, though, is that once those two segments butt up against each other again during recombination, they aren’t just two separate 4 cM segments, but one segment that is 8 cM long – that is now equal to or above the vendors’ matching threshold.
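
Here is a rough Python sketch of that idea. It is only an illustration, not how any vendor actually computes matches. Segments are simple (start, end) positions in cM, and the 7 cM threshold is an assumption I chose for the example because real vendor thresholds vary.

```python
MATCH_THRESHOLD_CM = 7.0  # assumed for illustration; real vendor thresholds vary


def merge_adjacent(segments):
    """Merge (start_cm, end_cm) segments that touch or overlap into single runs."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # This segment butts up against the previous one, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def reportable(segments, threshold=MATCH_THRESHOLD_CM):
    """Keep only segments long enough to be reported as a match."""
    return [(start, end) for start, end in segments if end - start >= threshold]


# Blue and pink 4 cM segments separated by a gap: neither reaches the threshold.
separated = [(10.0, 14.0), (20.0, 24.0)]
# The same two segments side by side: together they form one 8 cM segment.
rejoined = [(10.0, 14.0), (14.0, 18.0)]

print(reportable(merge_adjacent(separated)))
print(reportable(merge_adjacent(rejoined)))
```

The separated pair produces no reportable segment, while the rejoined pair produces one 8 cM segment that clears the assumed threshold.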

This is why people descended from endogamous populations often have the following matching characteristics:

  • More matches
  • Many smaller segment matches
  • Their total cM is often broken into more, smaller segments

What do more, smaller segments look like, exactly?

More, Smaller Segments

All of our vendors except Ancestry have a chromosome browser for their customers to compare their DNA to that of their matches visually.

Let’s take a look at some examples of what endogamous and non-endogamous matches look like.

For example, here’s a screen shot of a random Jewish second cousin match – 298 cM total, divided into 12 segments, with a longest segment of 58 cM.

A second Jewish 2C with 323 cM total, across 19 segments, with a 69 cM longest block.

A fully Acadian 2C match with 600 cM total, across 27 segments, with a longest segment of 69 cM.

A second Acadian 2C with 332 cM total, across 20 segments, with a longest segment of 42 cM.

Next, a non-endogamous 2C match with 217 cM, across 7 segments, with a longest segment of 72 cM.

Here’s another non-endogamous 2C example, with 169 shared cM, across 6 segments, with a longest segment of 70 cM.

Here’s the second cousin data in a summary table. The take-away is the number of segments relative to the total cM.

Tester Population | Total cM | Longest Block | Total Segments
Jewish 2C | 298 | 58 | 12
Jewish 2C | 323 | 69 | 19
Acadian 2C | 600 | 69 | 27
Acadian 2C | 332 | 42 | 20
Non-endogamous 2C | 217 | 72 | 7
Non-endogamous 2C | 169 | 70 | 6
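
One way to put numbers on “more, smaller segments” is to compute the average segment size and the share of the total carried by the longest block. The Python sketch below does that for the six matches in the table above. The two measures are my own choice for illustration, not an established endogamy test.

```python
# (population, total cM, longest block cM, segment count) from the table above
second_cousins = [
    ("Jewish", 298, 58, 12),
    ("Jewish", 323, 69, 19),
    ("Acadian", 600, 69, 27),
    ("Acadian", 332, 42, 20),
    ("Non-endogamous", 217, 72, 7),
    ("Non-endogamous", 169, 70, 6),
]


def average_segment_cm(total_cm, segment_count):
    """Average size of a matching segment, in cM."""
    return total_cm / segment_count


def longest_block_share(total_cm, longest_cm):
    """Percentage of the total match carried by the single longest block."""
    return longest_cm / total_cm * 100


for population, total_cm, longest_cm, segment_count in second_cousins:
    avg = average_segment_cm(total_cm, segment_count)
    share = longest_block_share(total_cm, longest_cm)
    print(f"{population:15} average segment {avg:5.1f} cM, longest block {share:4.1f}% of total")
```

For these six matches, the endogamous averages all fall below 25 cM per segment and the non-endogamous averages fall above 28 cM. Six matches are far too few to turn that into a rule, but it shows the pattern to look for.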

You can see more examples and comparisons between Native American, Jewish and non-endogamous DNA individuals in the article, Concepts – Endogamy and DNA Segments.

I suspect that a savvy mathematician could predict endogamy based on longest block and total segment information.

Lara Diamond, a mathematician who writes at Lara’s Jewnealogy, might be up for this challenge. She just published compiled matching and segment information in her Ashkenazic Shared DNA Survey Results for those who are interested. You can also contribute to Lara’s data, here.

Endogamy, Segments, and Distant Relationships

While not relevant to searching for close relatives, heavily endogamous matches 3C and more distant, to quote one of my Jewish friends, “dissolve into a quagmire of endogamy and are exceedingly difficult to unravel.”

In my own Acadian endogamous line, I often simply have to label them “Acadian” because the DNA tracks back to so many ancestors in different lines. In other words, I can’t tell which ancestor the match is actually pointing to because the same DNA segment or segments is/are carried by several ancestors and their descendants due to founder effect.

The difference with the Acadians is that we can actually identify many or most of them, at least at some point in time. As my cousin, Paul LeBlanc, once said, if you’re related to one Acadian, you’re related to all Acadians. Then he proceeded to tell me that he and I are related 137 different ways. My head hurts!

It’s no wonder that endogamy is incredibly difficult beyond the first few generations when it turns into something like multi-colored jello soup.

“Are Your Parents Related?” Tool

There’s another tool that you can utilize to determine if your parents are related to each other.

To determine if your parents are related to each other, you need to know about Runs of Homozygosity (ROH).

ROH means that the DNA on both strands or copies of the same chromosome is identical.

For a few locations in a row, ROH can easily happen just by chance, but the longer the segment, the less likely that commonality occurs simply by chance.

The good news is that you don’t need to know the identity of either of your parents. You don’t need either of your parent’s DNA tests – just your own. You’ll need to upload your DNA file to GEDmatch, which is free.

Click on “Are your parents related?”

GEDMatch analyzes your DNA to see if any of your DNA, above a reasonable matching threshold, is identical on both strands, indicating that you inherited the exact same DNA from both of your parents.

A legitimate match, meaning one that’s not by chance, will include many contiguous matching locations, generally a minimum of 500 SNPs or locations in a row. GEDmatch’s minimum threshold for identifying identical ancestral DNA (ROH) is 200 contiguous SNPs and 7 cM.
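
For readers who like to see the mechanics, here is a simplified Python sketch of scanning one chromosome for runs of homozygosity. This is not GEDmatch’s actual algorithm. The function name, the data layout, and the toy data are my own assumptions, and the default thresholds of 500 SNPs and 7 cM are simply the figures this article uses for a genealogically meaningful segment. Real tools also handle no-calls and tolerate occasional mismatches, which this sketch does not.

```python
def find_roh(snps, min_snps=500, min_cm=7.0):
    """Find runs of homozygosity on one chromosome.

    snps: list of (position_cm, allele_a, allele_b) tuples, sorted by position.
    Returns a list of (start_cm, end_cm, snp_count) for runs meeting both thresholds.
    """
    runs = []
    run_start = None  # index of the first SNP in the current homozygous run

    def record(first, last):
        count = last - first + 1
        length_cm = snps[last][0] - snps[first][0]
        if count >= min_snps and length_cm >= min_cm:
            runs.append((snps[first][0], snps[last][0], count))

    for i, (_, allele_a, allele_b) in enumerate(snps):
        if allele_a == allele_b:
            if run_start is None:
                run_start = i
        else:
            # A heterozygous SNP ends the current run, if there is one.
            if run_start is not None:
                record(run_start, i - 1)
            run_start = None
    if run_start is not None:
        record(run_start, len(snps) - 1)
    return runs


# Toy chromosome: 600 homozygous SNPs, one heterozygous SNP, then 50 more homozygous SNPs.
toy = [(i * 0.02, "A", "A") for i in range(600)]
toy.append((600 * 0.02, "A", "G"))
toy += [((601 + i) * 0.02, "C", "C") for i in range(50)]

roh = find_roh(toy)
print(roh)
```

Only the first run qualifies. The second run of 50 SNPs is the kind of tiny identical-by-chance sliver that gets ignored.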

Here’s my result, including the graphic for the first two chromosomes. Notice the tiny green bars that show identical by chance tiny sliver segments.

I have no significant identical DNA, meaning my parents are not related to each other.

Next, let’s look at an endogamous example where there are small, completely identical segments across a person’s chromosomes.

This person’s Acadian parents are related to each other, but distantly.

Next, let’s look at a Jewish person’s results.

You’ll notice larger green matching ROH, but not over 200 contiguous SNPs and 7 cM.

GEDMatch reports that this Jewish person’s parents are probably not related within recent generations, but it’s clear that they do share DNA in common.

People whose parents are distantly related have relatively small, scattered matching segments. However, if you’re seeing larger ROH segments that would be large enough to match in a genealogical setting, meaning multiple segments greater than 7 cM and 500 SNPs, you may be dealing with a different type of situation where cousins have married in recent generations. The larger the matching segments, generally, the closer in time.

Blogger Kitty Cooper wrote an article, here, about discovering that your parents are related at the first cousin level, and what their GEDMatch “Are Your Parents Related” results look like.

Let’s look for more clues.

Surnames

There MAY be an endogamy clue in the surnames of the people you match.

Viewing surnames is easier if you download your match list, which you can do at every vendor except Ancestry. I’m not referring to the segment data, but the information about your matches themselves.

I provided instructions in the recent article, How to Download Your DNA Match Lists and Segment Files, here.

If you suspect endogamy for any reason, look at your closest matches and see if there is a discernable trend in the surnames, or locations, or any commonality between your matches to each other.

For example, Jewish, Acadian, and Native surnames may be recognizable, as may locations.

You can evaluate in either or both of two ways:

  • The surnames of your closest matches. Closest matches listed first will be your default match order.
  • Your most frequently occurring surnames, minus extremely common names like Smith, Jones, etc., unless they are also in your closest matches. To utilize this type of matching, sort the spreadsheet in surname order and then scan or count the number of people with each surname.
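
If you’re comfortable with a little scripting, the counting can be automated instead of done by eye. The Python sketch below is a minimal example under stated assumptions. The column name “Last Name” is hypothetical, because each vendor’s download file uses its own headers, and the sample rows are made-up stand-ins for a real match file.

```python
import csv
import io
from collections import Counter

# Names treated as too common to be informative; extend this list to taste.
COMMON_SURNAMES = {"smith", "jones", "johnson", "williams", "brown"}


def surname_counts(csv_file, surname_column="Last Name"):
    """Count surnames in a match list, skipping blanks and very common names."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        surname = (row.get(surname_column) or "").strip()
        if surname and surname.lower() not in COMMON_SURNAMES:
            # title() merges "bergeron" and "Bergeron", but it also flattens
            # mixed-case names such as "LeBlanc", so adjust if that matters.
            counts[surname.title()] += 1
    return counts


# A tiny made-up stand-in for a downloaded match file.
sample = io.StringIO(
    "First Name,Last Name\n"
    "Marie,Bergeron\n"
    "Paul,Hebert\n"
    "Anne,bergeron\n"
    "John,Smith\n"
    "Luc,Gaudet\n"
)

counts = surname_counts(sample)
print(counts.most_common(3))
```

To use it on a real download, open the file with open(path, newline="", encoding="utf-8") and pass in the surname column header your vendor actually uses.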

Here are some examples from our testers.

Jewish – Closest surname matches.

  • Roth
  • Weiss
  • Goldman
  • Schonwald
  • Levi
  • Cohen
  • Slavin
  • Goodman
  • Sender
  • Trebatch

Acadian – Closest surname matches.

  • Bergeron
  • Hebert
  • Bergeron
  • Marcum
  • Muise
  • Legere
  • Gaudet
  • Perry
  • Verlander
  • Trombley

Native American – Closest surname matches.

  • Ortega
  • Begay
  • Valentine
  • Hayes
  • Montoya
  • Sun Bear
  • Martin
  • Tsosie
  • Chiquito
  • Yazzie

You may recognize these categories of surnames immediately.

If not, Google is your friend. Eliminate common surnames, then Google for a few together at a time and see what emerges.

The most unusual surnames are likely your best bets.

Projects

Another way to get some idea of what groups people with these surnames might belong to is to enter the surname in the FamilyTreeDNA surname search.

Go to the main FamilyTreeDNA page, but DO NOT sign on.

Scroll down until you see this image.

Type the surname into the search box. You’ll see how many people have tested with that surname, along with projects where project administrators have included that surname indicating that the project may be of interest to at least some people with that surname.

Here’s a portion of the project list for Cohen, a traditional Jewish surname.

These results are for Muise, an Acadian surname.

Clicking through to relevant surname projects, and potentially contacting the volunteer project administrator, can go a very long way in helping you gather and sift information. Clearly, they have an interest in this topic.

For example, here’s the Muise surname in the Acadian AmerIndian project. Two great hints here – Acadian heritage and Halifax, Nova Scotia.

Repeat for the balance of surnames on your list to look for commonalities, including locations on the public project pages.

Locations

Some of the vendor match files include location information. Each person on your match list will have the opportunity at the vendor where they tested to include location information in a variety of ways, either for their ancestors or themselves.

Where possible, it’s easiest to sort or scan the download file for this type of information.

Ancestry does not provide or facilitate a match list download, but you can still create your own for your closest 20 or 30 matches in a spreadsheet.

MyHeritage provides common surname and ancestral location information for every match. How cool is that!

Y DNA, Mitochondrial DNA, and Endogamy

Haplogroups for both Y and mitochondrial DNA can indicate and sometimes confirm endogamy. In other cases, the haplogroup won’t help, but the matches and their location information just might.

FamilyTreeDNA is the only vendor that provides Y DNA and mitochondrial DNA tests that include highly granular haplogroups along with matches and additional tools.

23andMe provides high-level haplogroups which may or may not be adequate to pinpoint a haplogroup that indicates endogamy.

Of course, only males carry Y DNA that tracks to the direct paternal (surname) line, but everyone carries their mother’s mitochondrial DNA that represents their mother’s mother’s mother’s, or direct matrilineal line.

Some haplogroups are known to be closely associated with particular ethnicities or populations, like Native Americans, Pacific Islanders, and some Jewish people.

Haplogroups reach back in time before genealogy and can give us a sense of community that’s not available by either looking in the mirror or through traditional records.

This Native American man is a member of high-level haplogroup Q-M242. However, some men who carry this haplogroup are not Native, but are of European or Middle Eastern origin.

I entered the haplogroup in the FamilyTreeDNA Discover tool, which I wrote about, here.

Checking the information about this haplogroup reveals that their common ancestor descended from an Asian man about 30,000 years ago.

The migration path in the Americas explains why this person would have an endogamous heritage.

Our tester would receive a much more refined haplogroup if he upgraded to the Big Y test at FamilyTreeDNA, which would remove all doubt.

However, even without additional testing, information about his matches at FamilyTreeDNA may be very illuminating.

The Q-M242 Native man’s Y DNA matches men with more granular haplogroups, shown above, at left. On the Haplogroup Origins report, you can see that these people have all selected the “US (Native American)” country option.

Another useful tool would be to check the public Y haplotree, here, and the public mitochondrial tree here, for self-reported ancestor location information for a specific haplogroup.

Here’s an example of mitochondrial haplogroup A2 and a few subclades on the public mitochondrial tree. You can see that the haplogroup is found in Mexico, the US (Native,) Canada, and many additional Caribbean, South, and Central American countries.

Of course, Y DNA and mitochondrial DNA (mtDNA) each tell a laser-focused story of one specific line. The great news, if you’re seeking information about your mother or father, is that Y DNA follows your father’s direct paternal (surname) line, and mitochondrial DNA follows your mother’s direct matrilineal line.

By combining Y and mitochondrial DNA results with ethnicity, autosomal matching, and the wide range of other tools that open doors, you will be able to reveal a great deal of information about whether you have endogamous heritage or not – and if so, from where.

I’ve provided a resource for stepping through and interpreting your Y DNA results, here, and mitochondrial DNA, here.

Discover for Y DNA Only

If you’re a female, you may feel left out of Y DNA testing and what it can tell you about your heritage. However, there’s a back door.

You can utilize the Y DNA haplogroups of your closest autosomal matches at both FamilyTreeDNA and 23andMe to reveal information.

Haplogroup information is available in the download files for both vendors, in addition to the Family Finder table view, below, at FamilyTreeDNA, or on your individual matches’ profile cards at both 23andMe and FamilyTreeDNA.

You can enter any Y DNA haplogroup in the FamilyTreeDNA Discover tool, here.

You’ll be treated to:

  • Your Haplogroup Story – how many testers have this haplogroup (so far), where the haplogroup is from, and the haplogroup’s age. In this case, the haplogroup was born in the Netherlands about 250 years ago, give or take 200 years. I know that it was 1806 or earlier based on the common ancestor of the men who tested.
  • Country Frequency – heat map of where the haplogroup is found in the world.
  • Notable Connections – famous and infamous (this haplogroup’s closest notable person is Leo Tolstoy).
  • Migration Map – migration path out of Africa and through the rest of the world.
  • Ancient Connections – ancient burials. His closest ancient match is from about 1000 years ago in Ukraine. Their shared ancestor lived about 2000 years ago.
  • Suggested Projects – based on the surname, projects that other matches have joined, and haplogroups.
  • Scientific Details – age estimates, confidence intervals, graphs, and the mutations that define this haplogroup.

I wrote about the Discover tool in the article, FamilyTreeDNA DISCOVER Launches – Including Y DNA Haplogroup Ages.

Endogamy Tools Summary Tables

Endogamy is a tough nut sometimes, especially if you’re starting from scratch. In order to make this topic a bit easier and to create a reference tool for you, I’ve created three summary tables.

  • Various endogamy-related tools available at each vendor which will or may assist with evaluating endogamy
  • Tools and their ability to detect endogamy in different groups
  • Tools best suited to assist people seeking information about unknown parents or grandparents

Summary of Endogamy Tools by Vendor

Please note that GEDMatch is not a DNA testing vendor, but they accept uploads and do have some tools that the testing vendors do not.

Tool | 23andMe | Ancestry | FamilyTreeDNA | MyHeritage | GEDMatch
Ethnicity | Yes | Yes | Yes | Yes | Use the vendors
Ethnicity Painting | Yes + segments | Yes, limited | Yes + segments | Yes |
Ethnicity Phasing | Yes | Partial | Yes | No |
DNA Communities | No | Yes | No | No |
Genetic Groups | No | No | No | Yes |
Family Matching aka Bucketing | No | No | Yes | No |
Chromosome Browser | Yes | No | Yes | Yes | Yes
AutoClusters | Through Genetic Affairs | No | Through Genetic Affairs | Yes, included | Yes, with subscription
Match List Download | Yes, restricted # of matches | No | Yes | Yes | Yes
Projects | No | No | Yes | No |
Y DNA | High-level haplogroup only | No | Yes, full haplogroup with Big Y, matching, tools, Discover | No |
Mitochondrial DNA | High-level haplogroup only | No | Yes, full haplogroup with mtFull, matching, tools | No |
Public Y Tree | No | No | Yes | No |
Public Mito Tree | No | No | Yes | No |
Discover Y DNA – public | No | No | Yes | No |
ROH | No | No | No | No | Yes

Summary of Endogamous Populations Identified by Each Tool

The following chart provides a guideline for which tools are useful for the following types of endogamous groups. Bolded tools require that both parents be descended from the same endogamous group, but several other tools give more definitive results with higher amounts of endogamy.

Y and mitochondrial DNA testing are not affected by admixture, autosomal DNA or anything from the “other” parent.

Tool Jewish Acadian Anabaptist Native Other/General
Ethnicity Yes No No Yes Pacific Islander
Ethnicity Painting Yes No No Yes Pacific Islander
Ethnicity Phasing Yes, if different No No Yes, if different Pacific Islander, if different
DNA Communities Yes Possibly Possibly Yes Pacific Islander
Genetic Groups Yes Possibly Possibly Yes Pacific Islander
Family Matching aka Bucketing Yes Yes Possibly Yes Pacific Islander
Chromosome Browser Possibly Possibly Yes, once segments or ancestors identified Possibly Pacific Islander, possibly
Total Matches Yes, compared to non-endogamous No No No No, unknown
AutoClusters Yes Yes Uncertain, probably Yes Pacific Islander
Estimated Relationships High Not always Sometimes No Sometimes Uncertain, probably
Relationship Range High Possibly, sometimes Possibly Possibly Possibly Pacific Islander, possibly
More, Smaller Segments Yes Yes Probably Yes Pacific Islander, probably
Parents Related Some but minimal Possibly Uncertain Probably similar to Jewish Uncertain, Possibly
Surnames Probably Probably Probably Not Possibly Possibly
Locations Possibly Probably Probably Not Probably Probably Pacific Islander
Projects Probably Probably Possibly Possibly Probably Pacific Islander
Y DNA Yes, often Yes, often No Yes Pacific Islander
Mitochondrial DNA Yes, often Sometimes No Yes Pacific Islander
Y public tree Probably not alone No No Yes Pacific Islander
MtDNA public tree Probably not No No Yes Pacific Islander
Y DNA Discover Yes Possibly Probably not, maybe projects Yes Pacific Islander

Summary of Endogamy Tools to Assist People Seeking Unknown Parents and Grandparents

This table provides a summary of when each of the various tools can be useful to:

  • People seeking unknown close relatives
  • People who already know who their close relatives are, but are seeking additional information or clues about their genealogy

I considered rating these on a 1 to 10 scale, but the relative usefulness of these tools is dependent on many factors, so different tools will be more or less useful to different people.

For example, ethnicity is very useful if someone is admixed from different populations, or even 100% of a specific endogamous population. It’s less useful if the tester is 100% European, regardless of whether they are seeking close relatives or not. Conversely, even “vanilla” ethnicity can be used to rule out majority or recent admixture with many populations.

Tools | Unknown Close Relative Seekers | Known Close Relatives – Enhance Genealogy
Ethnicity | Yes, to identify or rule out populations | Yes
Ethnicity Painting | Yes, possibly, depending on population | Yes, possibly, depending on population
Ethnicity Phasing | Yes, possibly, depending on population | Yes, possibly, depending on population
DNA Communities | Yes, possibly, depending on population | Yes, possibly, depending on population
Genetic Groups | Possibly, depending on population | Possibly, depending on population
Family Matching aka Bucketing | Not if parents are entirely unknown, but yes if one parent is known | Yes
Chromosome Browser | Unlikely | Yes
AutoClusters | Yes | Yes, especially at MyHeritage if Jewish
Estimated Relationships High | Not | No
Relationship Range High | Not reliably | No
More, Smaller Segments | Unlikely | Unlikely other than confirmation
Match List Download | Yes | Yes
Surnames | Yes | Yes
Locations | Yes | Yes
Projects | Yes | Yes
Y DNA | Yes, males only, direct paternal line, identifies surname lineage | Yes, males only, direct paternal line, identifies and correctly places surname lineage
Mitochondrial DNA | Yes, both sexes, direct matrilineal line only | Yes, both sexes, direct matrilineal line only
Public Y Tree | Yes for locations | Yes for locations
Public Mito Tree | Yes for locations | Yes for locations
Discover Y DNA | Yes, for heritage information | Yes, for heritage information
Parents Related – ROH | Possibly | Less useful

Acknowledgments

A HUGE thank you to several people who contributed images and information in order to provide accurate and expanded information on the topic of endogamy. Many did not want to be mentioned by name, but you know who you are!!!

If you have information to add, please post in the comments.

FamilyTreeDNA Relaunch – New Feature Overview

The brand-new FamilyTreeDNA website is live!

I’m very pleased with the investment that FamilyTreeDNA has made in their genealogy platform and tools. This isn’t just a redesign, it’s more of a relaunch.

I spoke briefly yesterday with Dr. Lior Rauchberger, CEO of myDNA, the parent company of FamilyTreeDNA. He’s excited too and said:

“The new features and enhancements we are releasing in July are the first round of updates in our exciting product roadmap. FamilyTreeDNA will continue to invest heavily in the advancement of genetic genealogy.”

In other words, this is just the beginning.

In case you were wondering, all those features everyone asked for – Lior listened.

Lior said earlier in 2021 that he was going to do exactly this and he’s proven true to his word, with this release coming just half a year after he took the helm. Obviously, he hit the ground running.

A few months ago, Lior said that his initial FamilyTreeDNA focus was going to be on infrastructure, stability, and the customer experience. In other words, creating a foundation to build on.

The new features, improvements, and changes are massive and certainly welcome.

I’ll be covering the new features in a series of articles, but in this introductory article, I’m providing an overview so you can use it as a guide to understand and navigate this new release.

Change is Challenging

I need to say something here.

Change is hard. In fact, change is the most difficult challenge for humans. We want improvements, yet we hate it when the furniture is rearranged in our “room.” However, we can’t have one without the other.

So, take a deep breath, and let’s view this as a great new adventure. These changes and tools will provide us with a new foundation and new clues. Think of this as finding long-lost documents in an archive about your ancestors. If someone told me that there is a potential for discovering the surname of one of my elusive female ancestors in an undiscovered chest in a remote library, trust me, I’d be all over it – regardless of where it was or how much effort I had to expend to get there. In this case, I can sit right here in front of my computer and dig for treasure.

We just need to learn to navigate the new landscape in a virtual room. What a gift!

Let’s start with the first thing you’ll see – the main page when you sign in.

Redesigned Main Page

The FamilyTreeDNA main page has changed. To begin with, the text is darker and the font is larger across the entire platform. OMG, thank you!!!

The main page has been flipped left to right, with results on the left now. Projects, surveys, and other information, along with haplogroup badges are on the right. Have you answered any surveys? I don’t think I even noticed them before. (My bad!)

The top tabs have changed too. The words myTree and myProjects are now gone, and descriptive tabs have replaced those. The only “my” thing remaining is myOrigins. This change surprises me with myDNA being the owner.

The Results & Tools tab at the top shows the product dropdowns.

The most popular tabs are shown individually under each product, with additional features being grouped under “See More.”

Every product now has a “See More” link where less frequently used widgets will be found, including the raw data downloads. This is the Y DNA “See More” dropdown by way of example.

You can see the green Updated badge on the Family Finder Matches tab. I don’t know if that badge will always appear when customers have new matches, or if it’s signaling that all customers have updated Family Finder Matches now.

We’ll talk about matches in the Family Finder section.

The Family Finder “See More” tab includes the Matrix, ancientOrigins, and the raw data file download.

The mitochondrial DNA section, titled Maternal Line Ancestry, mtDNA Results and Tools includes several widgets grouped under the “See More” tab.

Additional Tests and Tools

The Additional Tests and Tools area includes a link to your Family Tree (please do upload or create one,) Public Haplotrees, and Advanced Matches.

Public haplotrees are free-to-the-public Y and mitochondrial DNA trees that include locations. They are also easily available to FamilyTreeDNA customers here.

Please note that you access both types of trees from one location after clicking the Public Haplotrees page. The tree defaults to Y-DNA, but just click on mtDNA to view mitochondrial haplogroups and locations. Both trees are great resources because they show the location flags of the earliest known ancestors of the testers within each haplogroup.

Advanced Matches used to be available from the menu within each test type, but since advanced matching includes all three types of tests, it’s now located under the Additional Tests and Tools banner. Don’t forget about Advanced Matches – it’s really quite useful to determine if someone matches you on multiple types of tests and/or within specific projects.

Hey, look – I found a tooltip. Just mouse over the text and tabs on various pages to see where tooltips have been added.

Help and Help Center

The new Help Center is debuting in this release. The former Learning Center is transitioning to the Help Center with new, updated content.

Here’s an example of the new easy-to-navigate format. There’s a search function too.

Each individual page, test type, and section on your personal home page has a “Helpful Information” button.

On the main page, at the top right, you’ll see a new Help button.

Did you see that Submit Feedback link?

If you click on the Help Center, you’ll be greeted with context-sensitive help.

I clicked through from the dashboard, so that’s what I’m seeing. However, other available topics are shown at left.

I clicked on both of the links shown and the content has been updated with the new layout and features. No wonder they launched a new Help Center!

Account Settings

Account settings are still found in the same place, and those pages don’t appear to have changed. However, please keep in mind that some settings may take up to 24 hours to take effect.

Family Finder Rematching

Before we look at what has changed on your Family Finder pages, let’s talk about what happened behind the scenes.

FamilyTreeDNA has been offering the Family Finder test for 11 years, one of two very early companies to enter that marketspace. We’ve learned so much since then, not only about DNA itself, but about genetic genealogy, matching, triangulation, population genetics, how to use these tools, and more.

In order to make improvements, FamilyTreeDNA changed the match criteria, which necessitated rematching everyone to everyone else.

If you have a technology background of any type, you’ll immediately realize that this is a massive, expensive undertaking requiring vast computational resources. Not only that, but the rematching has to be done in tandem with new kits coming in, coordinated for all customers, and rolled out at once. Based on new matches and features, the user interface needed to be changed too, at the same time.

Sounds like a huge headache, right?

Why would a company ever decide to undertake that, especially when there is no revenue for doing so? The answer is to make functionality and accuracy better for their customers. Think of this as a new bedrock foundation for the future.

FamilyTreeDNA has made computational changes and implemented several features that require rematching:

  • Improved matching accuracy, in particular for people in highly endogamous populations. People in this category have thousands of matches that occur simply because they share multiple distant ancestors from within the same population. That combination of multiple common ancestors makes their current match relationships appear to be closer in time than they are. In order to change matching algorithms, FamilyTreeDNA had to rewrite their matching software and then run matching all over to enable everyone to receive new, updated match results.
  • FamilyTreeDNA has removed segments below 6 cM following sustained feedback from the genealogical community.
  • X matching has changed as well and no longer includes anyone as an X match below 6 cM.
  • Family Matching, meaning the paternal, maternal and both “bucketing,” uses triangulation behind the scenes. That code also had to be updated.
  • Older transfer kits used to receive only closer matches because imputation was not in place when the original transfer/upload took place. All older kits have been imputed now and matched with the entire database, which is part of why you may have more matches.
  • Relationship range calculations have changed, based on the removal of microsegments, new matching methodology and rematching results.
  • FamilyTreeDNA moved to Build 37 of the human genome reference, formally named GRCh37 and sometimes informally called hg37. In layman’s terms, as scientists learn about our DNA, the human map of DNA changes and shifts slightly. The boundary lines change somewhat. Versions are standardized so all researchers can use the same base map or yardstick. In some cases, early genetic genealogy implementers are penalized because they will eventually have to rematch their entire database when they upgrade to a new build version, while vendors who came to the party later won’t have to bear that internal expense.

As you can see, almost every aspect of matching has changed, so everyone was rematched against the entire database. You’ll see new results. Some matches may be gone, especially distant matches or if you’re a member of an endogamous population.

You’ll likely have new matches due to older transfer kits being imputed to full compatibility. Your matches should be more accurate too, which makes everyone happy.

I understand a white paper is being written that will provide more information about the new matching algorithms.

Ok, now let’s check out the new Family Finder Matches page.

Family Finder Matches

FamilyTreeDNA didn’t just rearrange the furniture – there’s a LOT of new content.

First, a note. You’ll see “Family Finder” in some places, and “Autosomal DNA” in other places. That’s one and the same at FamilyTreeDNA. The Family Finder test is their autosomal test, named separately because they also have Y DNA and mitochondrial DNA tests.

When you click on Family Finder matches for the first time, you will assuredly notice one thing and will probably notice a second.

First, you’ll see a little tour that explains how to use the various new tools.

Secondly, you will probably see the “Generating Matches” notice for a few seconds to a few minutes while your match list is generated, especially if the site is busy because lots of people are signing on. I saw this message for maybe a minute or two before my match list filled.

This should be a slight delay, but with so many people signing in right now, my second kit took longer. If you receive a message that says you have no matches, just refresh your page. If you had matches before, you DO have matches now.

While working with the new interface this morning, I’ve found that refreshing the screen is the key to solving issues.

My kits that have a few thousand matches loaded Family Matching (bucketing) immediately, but this (Jewish) kit that has around 30,000 matches received this informational message instead. FamilyTreeDNA has removed the little spinning icon. If you mouse over the information, you’ll see the following message:

This isn’t a time estimate. Everyone receives the same message. The message didn’t even last long enough for me to get a screenshot on the first kit that received this message. The results completed within a minute or so. The Family Matching buckets will load as soon as the parental matching is ready.

These delays should only happen the first time, or if someone has a lot of matches that they haven’t yet viewed. Once you’ve signed in, your matches are cached, a technique that improves performance, so the loading should be speedy, or at least speedier, during the second and subsequent visits.

Of course, right now, all customers have an updated match list, so there’s something new for everyone.

Getting Help

Want to see that tutorial again?

Click on that little Help box in the upper right-hand corner. You can view the Tutorial, look at Quick References that explain what’s on this page, visit the Help Center or Submit Feedback.

Two Family Finder Matches Views – Detail and Table

The first thing you’ll notice is that there are two views – Detail View and Table View. The default is Detail View.

Take a minute to get used to the new page.

Detail View – Filter Matches by Match Type

I was pleased to see new filter buttons, located in several places on the page.

The Matches filter at left allows you to display only specific relationship levels, including X-Matches which can be important in narrowing matches to a specific subset of ancestors.

You can display only matches that fall within certain relationship ranges. Note the new “Remote Relative” category, which was previously called “Speculative.”

Parental Matching and Filtering by Test Type or Trees

All of your matches are displayed by default, of course, but you can click on Paternal, Maternal or Both, as before, to view only matches in those buckets. In order for the Family Matching bucketing feature to be enabled, you must attach known relatives’ DNA matches to their proper place in your tree.

Please note that I needed to refresh the page a couple of times to get my parental matches to load the first time. I refreshed a couple of times to be sure that all of my bucketed matches loaded. This should be a first-time loading blip.

There’s a new filter button to the right of the bucketing tabs.

You can now filter by who has trees and who has taken which kinds of tests.

You can apply multiple filters at the same time to further narrow your matches.

Important – Clearing Filters

It’s easy to forget you have a filter enabled. This section is important, in part because Clear Filter is difficult to find.

The clear filter button does NOT appear until you’ve selected a filter. However, after applying that filter, to clear it and RESET THE MATCHES to unfiltered, you need to click on the “Clear Filter” button which is located at the top of the filter selections, and then click “Apply” at the bottom of the menu. I looked for “clear filter” forever before finding it here.

You’re welcome😊

Enhanced Search

Thank goodness, the search functionality has been enhanced and simplified too. Full name search works, both here and on the Y DNA search page.

If you type in a surname without selecting any search filters, you’ll receive a list of anyone with that word in their name, or in their list of ancestral surnames. This does NOT include surnames in their tree if they have not added those surnames to their list of ancestral surnames.

Notice that your number of total matches and bucketed people will change based on the results of this search and any filters you have applied.

I entered Estes in the search box, with no filters. You can see that I have a total of 46 matches that contain Estes in one way or another, and how they are bucketed.

Estes is my birth surname. I noticed that three people with Estes in their information are bucketed maternally. This is the perfect example of why you can’t assume a genetic relationship based on only a surname. Those three people’s DNA matches me on my mother’s side. And yes, I confirmed that they matched my mother too on that same segment or segments.
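For readers who think in code, here is a minimal Python sketch of the unfiltered surname search as I understand it from the behavior described above. The record layout, the field names and the people are all made up for illustration. This is not FamilyTreeDNA's actual search code.

```python
# Hypothetical match records. Field names are assumptions for illustration only.
matches = [
    {"name": "Alex Estes", "ancestral_surnames": ["Estes", "Moore"], "tree_surnames": ["Estes"]},
    {"name": "Pat Smith", "ancestral_surnames": ["Estes", "Vannoy"], "tree_surnames": []},
    {"name": "Sam Jones", "ancestral_surnames": [], "tree_surnames": ["Estes"]},
]

def surname_search(records, term):
    """Return names of matches with the term in their name or ancestral surname list."""
    term = term.lower()
    hits = []
    for record in records:
        in_name = term in record["name"].lower()
        in_surname_list = any(term in s.lower() for s in record["ancestral_surnames"])
        # Surnames that appear only in the tree are deliberately ignored,
        # mirroring the behavior described in the text.
        if in_name or in_surname_list:
            hits.append(record["name"])
    return hits

print(surname_search(matches, "Estes"))
```

In this made-up data, Sam Jones is not returned because Estes appears only in the tree and not in the list of ancestral surnames.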

Search Filters

You can also filter by haplogroup. This is very specific. If you select mitochondrial haplogroup J, you will only receive Family Finder matches that have haplogroup J, NOT J1 or J1c or J plus anything.

If you’re looking for your own haplogroup, you’ll need to type your full haplogroup in the search box and select mtDNA Haplogroup in the search filter dropdown.
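The difference between an exact haplogroup filter and a "starts with" search can be sketched in a few lines of Python. The kit names and haplogroup assignments below are invented, and the function is my own illustration of the behavior described, not the vendor's code.

```python
# Made-up (kit, mitochondrial haplogroup) pairs for illustration.
matches = [("Kit A", "J"), ("Kit B", "J1"), ("Kit C", "J1c2"), ("Kit D", "H1"), ("Kit E", "J")]

def filter_by_haplogroup(records, haplogroup):
    """Exact filter: keep only matches whose haplogroup is exactly the one requested."""
    return [kit for kit, hg in records if hg == haplogroup]

# What the filter described above does: J only.
exact = filter_by_haplogroup(matches, "J")

# What it does NOT do: a prefix search that would also sweep in J1 and J1c2.
prefix = [kit for kit, hg in matches if hg.startswith("J")]

print(exact)
print(prefix)
```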

Resetting Search Results

To dismiss search results, click on the little X. It’s easy to forget that you have initiated a search, so I need to remember to dismiss searches after I’m finished with each one.

Export Matches

The “Export CSV” button either downloads your entire match list, or the list of filtered matches currently selected. This is not your segment information, but a list of matches and related information such as which side they are bucketed on, if any, notes you’ve made, and more.

Your segment information is available for download on the chromosome browser.

Sort By

The Sort By button sorts your matches rather than filtering them. Filters display ONLY the matches that meet the criteria you select, while sorts display all of your matches, ordered in a particular manner.

You can sort in any number of ways. The default is Relationship Range followed by Shared DNA.
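If the distinction between filtering and sorting is fuzzy, this small Python sketch may help. The records and the `range_rank` field (a smaller number meaning a closer predicted relationship) are assumptions I made up for the example, as is the sort direction. The site may order things differently.

```python
# Invented match records. "range_rank" is a hypothetical stand-in for Relationship Range.
matches = [
    {"name": "Kit A", "range_rank": 2, "shared_cm": 310.0, "has_tree": True},
    {"name": "Kit B", "range_rank": 3, "shared_cm": 95.0, "has_tree": False},
    {"name": "Kit C", "range_rank": 2, "shared_cm": 420.0, "has_tree": True},
    {"name": "Kit D", "range_rank": 4, "shared_cm": 22.0, "has_tree": True},
]

# Filter: only the matches that meet the condition are displayed.
with_trees = [m["name"] for m in matches if m["has_tree"]]

# Sort: every match is still displayed, ordered by relationship range first,
# then by shared DNA (largest first) within the same range.
ordered = [m["name"] for m in sorted(matches, key=lambda m: (m["range_rank"], -m["shared_cm"]))]

print(with_trees)
print(ordered)
```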

Your Matches – Detail View

A lot has changed, but after you get used to the new interface, it makes more sense and there are a lot more options available which means increased flexibility. Remember, you can click to enlarge any of these images.

To begin with, you can see the haplogroups of your matches if they have taken a Y or mitochondrial DNA test. If you match someone, you’ll see a little check in the haplogroup box. I’m not clear whether this means you’re a haplogroup match or that person is on your match list.

To select people to compare in the chromosome browser, you simply check the little square box to the left of their photo and the chromosome browser box pops up at the bottom of the page. We’ll review the chromosome browser in a minute.

The new Relationship Range prediction is displayed, based on new calculations with segments below 6 cM removed. The linked relationship is displayed below the range.

A linked relationship occurs when you link that person to their proper place in your tree. If you have no linked relationship, you’ll see a link to “assign relationship” which takes you to your tree to link this person if you know how you are related.

The segments below 6 cM are gone from the Shared DNA total and X matches are only shown if they are 6 cM or above.
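To make the effect of the 6 cM threshold concrete, here is a minimal Python sketch of how dropping small segments changes a Shared DNA total. The segment sizes, the `MIN_CM` name and the function are hypothetical. This illustrates the arithmetic only and is not FamilyTreeDNA's matching code.

```python
# Hypothetical illustration only, not the vendor's matching code.
MIN_CM = 6.0  # threshold described in the article

def shared_total(segments_cm, min_cm=MIN_CM):
    """Sum only the segments at or above the threshold and count how many were kept."""
    kept = [cm for cm in segments_cm if cm >= min_cm]
    return round(sum(kept), 2), len(kept)

# Made-up segment sizes (in cM) shared with one match
segments = [42.5, 11.2, 5.9, 3.1, 2.4, 1.8]

old_total = round(sum(segments), 2)          # every segment counted
new_total, kept_count = shared_total(segments)  # only segments of 6 cM or more

print(old_total, new_total, kept_count)
```

In this invented example, four small segments drop out, so the total falls even though nothing about the two testers' actual DNA has changed.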

In Common With and Not In Common With

In Common With and Not In Common With is the little two-person icon at the right.

Just click on the little person icon, then select “In Common With” to view your shared matches between you, that match, and other people. The person you are viewing matches in common with is highlighted at the top of the page, with your common matches below.

You can stack filters now. In this example, I selected my cousin, Don, to see our common matches. I added the search filter of the surname Ferverda, my mother’s maiden name. She is deceased and I manage her kit. You can see that my cousin Don and I have 5 total common matches – four maternal and one both, meaning one person matches me on both my maternal and paternal lines.

It’s great news that Cousin Don now pops up in the chromosome browser box at the bottom, enabling easy confusion-free chromosome segment comparisons directly from the In Common With match page. I love this!!!

All I have to do now is click on other people and then on Compare Relationship which pushes these matches through to the chromosome browser. This is SOOOO convenient.

You’ll see a new tree icon at right on each match. A dark tree means there’s content and a light tree means this person does not have a tree. Remember, you can filter by trees with content using the filter button beside “Both”.

Your notes are shown at far right. The note icon is dark grey for any person with a note and white for anyone without one.

If you’re looking for the email contact information, click on your match’s name to view their placard which also includes more detailed ancestral surname information.

Family Finder – Table View

The table view is very similar to the Detail View. The layout is a bit different with more matches visible in the same space.

This view has lots of tooltips on the column heading bar! Tooltips are great for everyone, but especially for people just beginning to find their way in the genetic genealogy world.

I’ll have to experiment a bit to figure out which view I prefer. I’d like to be able to set my own default for whichever view I want as my default. In fact, I think I’ll submit that in the “Submit Feedback” link. For every suggestion, I’m going to find something really positive to say. This was an immense overhaul.

Chromosome Browser

Let’s look at the Chromosome Browser.

You can arrive at the Chromosome Browser by selecting people on your match page, or by selecting the Chromosome Browser under the Results and Tools link.

Everything is pretty much the same on the chromosome browser, except the default view is now 6 cM and the smaller segments are gone. You can also choose to view only segments above 10 cM.

If you have people selected in the chromosome browser and click on Download Segments in the upper right-hand corner, it downloads the segments of only the people currently selected.

You can “Clear All” and then click on Download All Segments which downloads your entire segment file. To download all segments, you need to have no people selected for comparison.

The contents of this file are greatly reduced as it now contains only the segments 6 cM and above.

Family Tree

No, the family tree has not changed, and yes, it needs to, desperately. Trust me, the management team is aware and I suspect one of the improvements, hopefully sooner rather than later, will be an improved tree experience.

Y DNA

The Y DNA page has received an update too, adding both a Detail View and a Table View with the same basic functionality as the Family Finder matching above. If you are reading this article for Y DNA only, please read the Family Finder section to understand the new layout and features.

As before, the match comparison begins at the 111 marker level.

However, there’s a BIG difference. If there are no matches at this level, YOU NEED TO CLICK THE NEXT TAB. You can easily see that this person has matches at the 67 level and below, but the system no longer “counts down” through the various levels until it either finds a level with a match or reaches 12 markers.

If you’re used to the old interface, it’s easy to think you’re at the final destination of 12 markers with no matches when you’re still at 111.
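The old "count down" behavior is easy to picture as a loop. In this Python sketch, the match counts are invented and the function name is mine. It simply walks the standard marker levels from highest to lowest and reports the first one that has matches, which is the step you now need to do yourself by clicking the tabs.

```python
# Standard Y DNA marker levels, highest first.
LEVELS = [111, 67, 37, 25, 12]

def first_level_with_matches(match_counts):
    """Return the highest marker level that has at least one match, or None."""
    for level in LEVELS:
        if match_counts.get(level, 0) > 0:
            return level
    return None

# Invented match counts for one tester: no matches at 111, some at 67 and below.
counts = {111: 0, 67: 3, 37: 9, 25: 14, 12: 120}

print(first_level_with_matches(counts))
```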

Y DNA Detail View

The Y-DNA Detail View and Table View features are the same as those for Family Finder and are described in that section.

The new format is quite different. One improvement is that the Paternal Country of Origin is now displayed, along with a flag. How cool is that!

The Paternal Earliest Known Ancestor and Match Date are at far right. Note that match dates have been reset to the rerun date. At this point, FamilyTreeDNA is evaluating the possibility of restoring the original match date. Regardless, you’ll be able to filter for match dates when new matches arrive.

Please check to be sure you have your Country of Origin, Earliest Known Ancestor, and mapped location completed and up to date.

Earliest Known Ancestor

If you haven’t completed your Earliest Known Ancestor (EKA) information, now’s the perfect time. It’s easy, so let’s do it before you forget.

Click on the Account Settings gear beneath your name in the right-hand upper corner. Click on Genealogy, then on Earliest Known Ancestors and complete the information in the red boxes.

  • Direct paternal line means your father’s father’s father’s line – as far up through all fathers as you can reach. This is your Y DNA lineage, but females should complete this information on general principles.
  • Direct maternal line means your mother’s mother’s mother’s line – as far up through all mothers as you can reach. This is your mitochondrial DNA lineage, so relevant for both males and females.

Completing all of the information, including the location, will help you and your matches as well when using the Matches Map.

Be sure to click Save when you’re finished.

Y DNA Filters

Y DNA has more filter options than autosomal.

The Y DNA filter, located to the right of the 12 Markers tab allows testers to filter by:

  • Genetic distance, meaning how many mutations separate you from your matches
  • Groups, meaning group projects that the tester has joined
  • Tree status
  • Match date
  • Level of test taken
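For anyone curious about what "genetic distance" counts, here is a deliberately simplified Python sketch. It assumes every marker is compared step by step, so a difference of two repeats counts as two. The vendor's real calculation treats certain markers differently, so treat this as an illustration of the idea only. The marker values are invented.

```python
def genetic_distance(profile_a, profile_b):
    """Simplified step-wise count: sum of repeat differences over shared markers."""
    shared_markers = profile_a.keys() & profile_b.keys()
    return sum(abs(profile_a[m] - profile_b[m]) for m in shared_markers)

# Invented repeat counts for four Y DNA STR markers.
me = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
match = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 9}

print(genetic_distance(me, match))
```

Here one marker differs by one repeat and another by two, which this simplified count adds up to a distance of three.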

If none of your matches have taken the 111 marker test or you don’t match anyone at that level, that test won’t show up on your list.

Y DNA Table View

As with Family Finder, the Table View is more condensed and additional features are available on the right side of each match. For details, please review the Family Finder section.

If you’re looking for the old Y DNA TiP report, it’s now at the far right of each match.

The actual calculator hasn’t changed yet. I know people were hoping for the new Y DNA aging in this release, but that’s yet to follow.

Other Pages

Other pages like the Big Y and Mitochondrial DNA did not receive new features or functionality in this release, but do sport new user-friendly tooltips.

I lost track, but I counted over 100 tooltips added across the platform, and this is just the beginning.

There are probably more new features and functionality that I haven’t stumbled across just yet.

And yes, we are going to find a few bugs. That’s inevitable with something this large. Please report anything you find to FamilyTreeDNA.

Oh wait – I almost forgot…

New Videos

I understand that there are in the ballpark of 50 new videos that are being added to the new Help Center, either today or very shortly.

When I find out more, I’ll write an article about what videos are available and where to find them. People learn in various ways. Videos are often requested and will be a popular addition. I considered making videos, but that’s almost impossible for anyone besides the vendor because the names on screens either need to be “fake” or the screen needs to be blurred.

So hurray – very glad to hear these are imminent!

Stay Tuned

Stay tuned for new developments. As Lior said, FamilyTreeDNA is investing heavily in genetic genealogy and there’s more to come.

My Mom used to say that the “proof is in the pudding.” I’d say the myDNA/FamilyTreeDNA leadership team has passed this initial test with flying colors.

Of course, there’s more to do, but I’m definitely grateful for this lovely pudding. Thank you – thank you!

I can’t wait to get started and see what new gems await.

Take a Look!

Sign in and take a look for yourself.

Do you have more matches?

Are your matches more accurate?

How about predicted relationships?

How has this new release affected you?

What do you like the best?

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Triangulation in Action at Family Tree DNA

Recently, I published the article, Hitting a Genealogy Home Run Using Your Double-Sided Two-Faced Chromosomes While Avoiding Imposters. The “Home Run” article explains why you want to use a chromosome browser, what you’re seeing and what it means to you.

This article, and the rest in the “Triangulation in Action” series, introduces triangulation at Family Tree DNA, MyHeritage, 23andMe, GedMatch and DNAPainter, explaining how to use triangulation to confirm descent from a common ancestor. You may want to read the introductory article first.

What is Triangulation?

Think of triangulation as a three-legged stool – a triangle. Triangulation requires three things:

  1. At least three (not closely related) people must match
  2. On the same reasonably sized segment of DNA and
  3. Descend from a common ancestor

Triangulation is the foundation of confirming descent from a common ancestor, and thereby assigning a specific segment to that ancestor. Without triangulation, you might just have a match to someone else by chance. You can confirm mathematical triangulation, numbers 1 and 2, above, without knowing the identity of the common ancestor.
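The mathematical part of triangulation, numbers 1 and 2 above, can be sketched in Python. This is a simplified illustration under several assumptions: each pair of people has at most one matching segment, segments are written as (chromosome, start, end), the people and positions are invented, and no minimum size is enforced. It says nothing about number 3, the common ancestor, which still takes genealogy.

```python
from itertools import combinations

def overlap(seg_a, seg_b):
    """Return the portion two segments share as (chrom, start, end), or None."""
    if seg_a[0] != seg_b[0]:
        return None
    start = max(seg_a[1], seg_b[1])
    end = min(seg_a[2], seg_b[2])
    return (seg_a[0], start, end) if start < end else None

def triangulated_segment(pair_segments, people):
    """Return the segment on which every pair in the group matches, or None."""
    common = None
    for index, pair in enumerate(combinations(sorted(people), 2)):
        segment = pair_segments.get(frozenset(pair))
        if segment is None:  # this pair does not match at all
            return None
        common = segment if index == 0 else overlap(common, segment)
        if common is None:   # the pairs match, but not on the same segment
            return None
    return common

# Invented example 1: all three pairs overlap on chromosome 15.
triangulating = {
    frozenset({"me", "A"}): (15, 30_000_000, 60_000_000),
    frozenset({"me", "B"}): (15, 35_000_000, 70_000_000),
    frozenset({"A", "B"}): (15, 33_000_000, 58_000_000),
}

# Invented example 2: a three-way shared match on three different chromosomes.
shared_only = {
    frozenset({"me", "A"}): (15, 30_000_000, 60_000_000),
    frozenset({"me", "B"}): (3, 1_000_000, 20_000_000),
    frozenset({"A", "B"}): (7, 5_000_000, 25_000_000),
}

print(triangulated_segment(triangulating, ["me", "A", "B"]))
print(triangulated_segment(shared_only, ["me", "A", "B"]))
```

The second example is the situation where everyone matches everyone else, yet nothing triangulates because the matches fall on different segments.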

Boundaries

Triangulation means that all three, or more, people must match on a common segment. However, what you’re likely to see is that some people don’t match on the entire segment, meaning they match on more or less of it than others, as demonstrated in the following examples.

FTDNA Triangulation boundaries.png

You can see that I match 5 different cousins who I know descend from my father’s side on chromosome 15 above. As always, I’m the background grey and these matches are all being compared against me.

I triangulate with them in different ways, forming multiple triangulation groups that I’ve discussed individually, below.

Triangulation Group 1

FTDNA triangulation 1.png

Group 1 – On the left group of matches, above, I triangulate with the blue, red and orange people on the amount of DNA that is common among all of them, shown in the black box. This is triangulation group 1.

I’ve overlaid additional triangulation groups below, so you can compare the groups.

Triangulation Group 2

FTDNA triangulation 2.png

Group 2 – However, if you look just at the blue and orange triangulated matches bracketed in green, I triangulate on slightly more, extending to the left. This group excludes the red person because their beginning point is not the same, or even close. This is triangulation group 2.

Triangulation Group 3 and 4

FTDNA triang 3.png

Group 3 – At right, we see two large triangulation groups. Triangulation group 3 includes the common portions of blue, red, teal and orange matches.

Group 4 – Triangulation group 4 is the skinny group at far right and includes the common portion of the blue, teal and dark blue matches.

Triangulation Groups 5 and 6

FTDNA triang 5.png

Group 5 – There are also two more triangulation groups. The larger green bracketed group includes only the blue and teal people because their end locations are to the right of the end locations of the red and orange matches. The start location varies as well. This is triangulation group 5.

Group 6 – The smaller green bracketed group includes only the blue and teal people because their start locations are before that of the dark blue person. This is triangulation group 6.

There’s actually one more triangulation group. Can you spot it?

Triangulation Group 7

FTDNA triang 7.png

Group 7 – The tan group includes the red, teal and orange matches but only the areas where they all overlap. This excludes the top blue match because their start location is different. Triangulation group 7 only extends to the end of the red and orange matches, because those are the same locations, while the teal match extends further to the right. That extension is excluded in this group, of course.

Slight Variations

Matches with only slight start and end differences are probably descended from the same ancestor, but we can’t say that for sure (at this point) so we only include actual mathematically matching segments in a triangulation group.

You can see that triangulation groups often overlap because group members share more or less DNA with each other. Normally we don’t bother to number the groups – we just look at the alignment. I numbered them for illustration purposes.
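The way group boundaries shift depending on who is included can be shown with a short Python sketch. The start and end positions below are invented numbers on a single chromosome, loosely patterned after groups 1 and 2 above, and the function name is mine. The common portion of a group is simply the latest start and the earliest end among its members.

```python
def common_portion(segments):
    """Return the (start, end) shared by every segment in the group, or None."""
    start = max(s for s, _ in segments.values())
    end = min(e for _, e in segments.values())
    return (start, end) if start < end else None

# Invented positions (in megabases) for three matches compared against me.
browser = {"blue": (20, 48), "red": (27, 45), "orange": (21, 45)}

# All three together: the red match starts later, so the group starts later.
group_1 = common_portion(browser)

# Blue and orange only: the shared portion extends further to the left.
group_2 = common_portion({color: browser[color] for color in ("blue", "orange")})

print(group_1)
print(group_2)
```

Keep in mind that this only measures where the segments line up against me. The matches still have to match each other on that same portion before the group truly triangulates.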

Shared or In-Common-With Matching

Triangulation is not the same thing as a 3-way shared “in-common-with” match. You may share DNA with those two people, but on entirely different segments from entirely different ancestors. If those other two people match each other, it can be on a segment where you don’t match either of them, and thanks to an ancestor that they share who isn’t in your line at all. Shared matches are a great hint, especially in addition to other information such as Phased Family Matching which we’ll talk about in a minute, but shared matches don’t necessarily mean triangulation has occurred, although it’s a great place to start looking.

I have shared matches where I match one person on my maternal side, one on my paternal side, and they match each other through a completely different ancestor on an entirely different segment. However, we don’t triangulate because we don’t all match each other on the SAME segment of DNA. Yes, it can be confusing.

Just remember, each of your segments, and matches, has its own individual history.

Imputation Can Affect Matching

Over the years the chips on which our DNA is processed at the vendors have changed. Each new generation of chips tests a different number of markers, and sometimes different markers – with the overlaps between the entire suite of chips being less than optimal.

I can verify that most vendors use imputation to level the playing field, and even though two vendors have never verified that fact, I’m relatively certain that they all do. That’s the only way they could match to their own prior “only somewhat compatible” chip versions.

The net-net of this is that you may see some differences in matching segments at different vendors, even when you’re comparing the same people. Imputation generally “fills in the blanks,” but doesn’t create large swaths of non-existent DNA. I wrote about the concept of imputation here.

What I’d like you to take away from this discussion is to focus on the big picture: if and how people triangulate, which is the function important to genealogy, not whether the start and end points of the segments are exactly the same.

Triangulation Solutions

Each of the major vendors, except Ancestry, which does not have a chromosome browser, offers some type of triangulation solution, so let’s look at what each vendor offers. If your Ancestry matches have uploaded to GedMatch, Family Tree DNA or MyHeritage, you can triangulate with them there. Otherwise, you can’t triangulate Ancestry results, so encourage your Ancestry matches to transfer.

You can find step-by-step transfer instructions to and from each vendor, here.

I wrote more specifically about triangulation here and here.

Let’s start by looking at triangulation at Family Tree DNA.

Triangulation at Family Tree DNA

Family Tree DNA has two different tools that can be used separately in different circumstances to determine whether or not your segments triangulate.

Phased Family Matching can be used for triangulation.

The Matrix tool can be utilized for people who aren’t designated through Phased Family Matching as maternal or paternal matches to suggest or eliminate triangulation.

First, go to the Family Finder section of your personal page.

We’ll be working with Matches, the Chromosome Browser, and the Matrix.

FTDNA triangulation page.png

Phased Family Matching

At Family Tree DNA, I’ve tested my cousins:

  • Cheryl, my mother’s first cousin (1C)
  • Charlene, my first cousin once removed (1C1R) on my father’s side
  • David, my second cousin (2C) on my father’s side.

I’ve linked the test results of those cousins to my tree in their proper location, which allows Family Tree DNA to do something called Phased Family Matching.

If you don’t have a tree and don’t link your DNA results and those of your family members, Family Tree DNA can’t perform Phased Family Matching.

I explained phasing in the introductory article.

Testing your parents is wonderful if that’s possible, but parents aren’t always available to test. At Family Tree DNA, you don’t need to have tested your parents in order to have phased matches.

In essence, Family Tree DNA uses the DNA of known cousins, third cousins or closer, to assign matches to maternal or paternal tabs, or sides, also sometimes referred to as buckets. I wrote about Phased Family Matching here and here.

FTDNA triang buckets.png

You can see that of my 4806 matches, 1101 are assigned to my paternal side, 884 to my maternal side and 4 are assigned to both.

FTDNA triang header.png

FTDNA triang Charlene.png

My cousin Charlene is assigned to my paternal side, as shown by the blue icon, because I linked her to the correct position in my tree, as is my cousin, David, below.

FTDNA triang David.png

Conversely, my cousin Cheryl is assigned maternally because I linked her as well.

FTDNA triang Cheryl.png

These specific people are assigned maternally and paternally because I linked them to their proper place in my tree. These matches will allow Family Tree DNA to link other testers to the proper side of my tree too, because they match me and my cousin on the same segments – in essence phasing a large number of my matches for me which facilitates triangulation.

Linking Matches on Your Tree

In order to cause Phased Family Matching, aka, “bucketing” to occur, I linked my own test and that of my known 3rd cousins or closer to their proper places in my tree at Family Tree DNA.

If you don’t create a tree or upload a GEDCOM file and link yourself and your known matches, your matches can’t be assigned to maternal and paternal sides.


By utilizing the matching DNA between you and known close relatives on your maternal and paternal sides, Family Tree DNA assigns other people who match both of you on those same segments to the same side of your tree.

If you select matches from the same side of your tree and they match on the same segments, they triangulate.

Of course, that’s assuming the person doesn’t match you on both sides of your tree.
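For readers comfortable with a little code, the idea of "same side, same segment" can be sketched as follows. This is a minimal illustration only, not how Family Tree DNA does it internally. The names reuse my cousins from above, but the chromosome positions are invented, and the record layout (name, bucket, chromosome, start, end) is an assumption. It does not handle people assigned to both sides.

```python
from itertools import combinations

# Hypothetical segment records: (match name, bucket, chromosome, start, end).
# The positions are invented for illustration only.
segments = [
    ("Charlene", "paternal", 5, 10_000_000, 60_000_000),
    ("David", "paternal", 5, 25_000_000, 70_000_000),
    ("Cheryl", "maternal", 5, 20_000_000, 65_000_000),
]

def overlap(a, b):
    """Return the shared (start, end) range of two segments, or None."""
    if a[2] != b[2]:  # segments on different chromosomes never overlap
        return None
    start, end = max(a[3], b[3]), min(a[4], b[4])
    return (start, end) if start < end else None

def same_side_overlaps(records):
    """Pairs of matches in the same bucket whose segments overlap."""
    results = []
    for a, b in combinations(records, 2):
        shared = overlap(a, b)
        if shared and a[1] == b[1]:
            results.append((a[0], b[0], a[2], shared))
    return results

candidates = same_side_overlaps(segments)
print(candidates)
```

In this made-up data, Cheryl overlaps the same region but sits in the maternal bucket, so only Charlene and David are reported as a candidate pair.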

You can also download your matching segments in a file and sort to see who matches on the same locations, but the parental side designation (bucketing) is not reflected in the segment download file. Bucketing is reflected in the match download file which is a different file.

There are two separate download files, but they can be merged.

Two Download Files

The first file, your match download file, provides information about your matches such as their haplogroups, surnames and contact information, including bucketing assignment, but not the actual matching segment data.

The match file tells you a great deal and is both sortable and searchable. You can search for any surname, for example, or you can sort for everyone in the Paternal or Maternal matching bucket. You can creatively combine parts of this file with the matching segments file in order to quickly flag the people on your paternal side. Knowledge about how to work with spreadsheets is a plus.


This download is available at the bottom of the Family Finder match page.


You can download all of your matches, or just those in a filtered view, such as in-common-with or as the result of a surname search.


The second file, your matching segments file, is available on the chromosome browser page.

The matching segments file includes the match name along with the matching chromosome segments and number of matching SNPs.
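If you prefer a script to a spreadsheet, merging the two files might look something like the sketch below. The column headers ("Full Name", "Matching Bucket", "Match Name" and so on) and the sample rows are assumptions for illustration, so check the headers in your own downloads. Names may also be formatted differently between the two files, which this sketch does not try to reconcile.

```python
import csv
import io

# Tiny stand-ins for the two downloads. Column names and values are
# assumed for illustration; real files may use different headers.
match_csv = io.StringIO(
    "Full Name,Matching Bucket\n"
    "Charlene,Paternal\n"
    "David,Paternal\n"
    "Cheryl,Maternal\n"
)
segment_csv = io.StringIO(
    "Match Name,Chromosome,Start,End,cM,SNPs\n"
    "Charlene,5,10000000,60000000,48.2,9500\n"
    "David,5,25000000,70000000,41.7,8800\n"
    "Cheryl,5,20000000,65000000,44.0,9100\n"
    "Unlinked Person,7,1000000,9000000,9.3,1800\n"
)

# Build a name -> bucket lookup from the match file.
bucket_by_name = {
    row["Full Name"]: row["Matching Bucket"]
    for row in csv.DictReader(match_csv)
}

# Add the bucket to every segment row; unbucketed matches get "None".
merged = []
for row in csv.DictReader(segment_csv):
    row["Bucket"] = bucket_by_name.get(row["Match Name"], "None")
    merged.append(row)

# Flag the people on the paternal side, with their segment locations.
paternal = [r for r in merged if r["Bucket"] == "Paternal"]
for r in paternal:
    print(r["Match Name"], r["Chromosome"], r["Start"], r["End"])
```

To use real downloads, you would replace the two `io.StringIO` objects with files opened from disk.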


If you click through to the chromosome browser from your main page, as shown below, with NO MATCHES SELECTED, you will be able to download ALL matching segments.


You’ll see “Download All Segments” in the upper right-hand corner.


From that Chromosome Browser page, you will also have the ability to select matches to show on the browser.


If you select people on the match page before clicking on the chromosome browser, or select matches on the chromosome browser page, then clicking on “Download Segments” will only download the matching segments of the people that you have currently selected to match against in the browser.


Combinations of Tools and Filters

  • The chromosome browser tells you if people match you on the same segment.
  • The in-common-with filter on the match page tells you who you match in common with a specific person, but not if those two people match each other.

Of course, if both people are assigned to your same parental side bucket, and they both only match you on one large segment – and it’s the same segment, then you must triangulate.

If they aren’t both assigned to a parental bucket, then you can’t make that determination using parental side designations.

Is there a tool that allows you to compare people against each other at the same time to see if your matches also match each other?

Glad you asked.

Yes, there is.

The Matrix

Let’s say that you want to see if a group of people who you match also match each other.


Family Tree DNA provides a Matrix tool that allows you to select 10 (or fewer) matches in order to determine if your matches also match each other.


I’ve entered Cheryl, Charlene and David. You can see that David and Charlene match each other, and Cheryl doesn’t match either Charlene or David.

Of course, we know that’s accurate because:

  • I already know these people and their relationship to me and each other
  • These three people are already assigned to maternal and paternal sides or buckets, so the matrix is verifying what we already know
  • I know where they match on the same segment on the chromosome browser


Even though they match on the same segment on the chromosome browser, the fact that they are bucketed to different parental sides, and that the matrix shows that Cheryl doesn’t match either Charlene or David, confirms that David and Charlene triangulate with me, while Cheryl is not a member of that triangulation group.

This is exactly why triangulation is important. Looking at the image above, the only thing you know is that all 3 match you – but with the additional information about bucketing and the matrix, we know that only the two bottom people, Charlene and David, triangulate with me. Note that I’ve added the maternal and paternal icons for clarity.


However, if I didn’t have this knowledge, or not everyone was bucketed, the Matrix tool would be extremely useful. The matrix tool uses the matching threshold of approximately 7.69 cM.

The matrix doesn’t tell you if these people match each other on the same segment where they match you.

However, there’s a good probability that they do, especially if only one matching segment is involved.

You can check the chromosome browser to see if they both match you on the same segment. It’s possible if they don’t match you on the same segment that they match each other on different segments, and possibly through a different ancestor. You may need to reach out to them to ask if they match each other, and if they have known genealogy if they aren’t bucketed.
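To make the matrix idea concrete, here is a small sketch of recording who matches whom and grouping people who are connected by matches. The pairwise results are entered by hand, mirroring what I saw for Cheryl, Charlene and David. The function names are hypothetical and this is not the Matrix tool's actual logic. As noted above, matching each other does not prove a match on the same segment, so a group here is a lead to investigate, not a confirmed triangulation group.

```python
# Hypothetical pairwise results, as you might read them off the Matrix tool.
# Each frozenset is an unordered pair of people who match each other.
people = ["Cheryl", "Charlene", "David"]
matches = {frozenset(("Charlene", "David"))}

def build_matrix(names, pairs):
    """Return {name: {other: bool}} showing who matches whom."""
    return {
        a: {b: frozenset((a, b)) in pairs for b in names if b != a}
        for a in names
    }

def groups(names, pairs):
    """Group people who are connected to each other by matches."""
    remaining, result = list(names), []
    while remaining:
        group = {remaining.pop(0)}
        grew = True
        while grew:  # keep adding anyone who matches a current member
            grew = False
            for name in list(remaining):
                if any(frozenset((name, g)) in pairs for g in group):
                    group.add(name)
                    remaining.remove(name)
                    grew = True
        result.append(sorted(group))
    return result

matrix = build_matrix(people, matches)
clusters = groups(people, matches)
print(matrix["David"]["Charlene"], matrix["Cheryl"]["David"])
print(clusters)
```

With these inputs, Cheryl ends up alone and Charlene and David land in the same group, which is what the bucketing already told us.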

By utilizing the Matrix tool, you can isolate people to maternal and paternal sides of your tree.

Other Resources to Identify Common Ancestors

Be sure to check other clues at Family Tree DNA such as:

Shared surnames, shown on your matches page, with common surnames that you share bolded


Trees, indicated by the blue pedigree icon on the match page.


Y and mitochondrial DNA haplogroups and matching. You can view your match’s haplogroup and other information by clicking on their profile picture on your matches page.


Advanced Matching can be utilized to see if you match on combined tests, or in common projects.


This article discusses the 9 different autosomal tools available at Family Tree DNA.

What About You?

Do you have a tree at Family Tree DNA?

Have you connected your test and any family members to your tree?

Can you test a family member, third cousins or closer, or have them transfer a kit from another vendor?

Here’s how to transfer:

How many people do you have on your paternal and maternal tabs on your Family Finder matches page?

You can paint every single one of the people who are designated as maternal or paternal at DNAPainter to your grandparents on the respective maternal or paternal side. DNAPainter Instructions and Resources will explain how, and why.

Join me soon for similar articles about how to work with triangulation at MyHeritage, 23andMe, GedMatch and DNAPainter.

Most of all – have fun!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

2018 – The Year of the Segment

Looking in the rear view mirror, what a year! Some days it’s been hard to catch your breath because things have been moving so fast.

What were the major happenings, how did they affect genetic genealogy and what’s coming in 2019?

The SNiPPY Award

First of all, I’m giving an award this year. The SNiPPY.

Yea, I know it’s kinda hokey, but it’s my way of saying a huge thank you to someone in this field who has made a remarkable contribution and who deserves special recognition.

Who will it be this year?

Drum roll…….

The 2018 SNiPPY goes to…

DNAPainter – The 2018 SNiPPY award goes to DNAPainter, without question. Applause, everyone, applause! And congratulations to Jonny Perl, pictured below at Rootstech!

Jonny Perl created this wonderful, visual tool that allows you to paint your matches with people on your chromosomes, assigning the match to specific ancestors.

I’ve written about how to use the tool with different vendors’ results and have discovered many different ways to utilize the painted segments. The DNA Painter User Group is here on Facebook. I use DNAPainter EVERY SINGLE DAY to solve a wide variety of challenges.

What else has happened this year? A lot!

Ancient DNA – Academic research seldom reports on Y and mitochondrial DNA today and is firmly focused on sequencing ancient DNA. Ancient genome sequencing has only recently been developed to a state where at least some remains can be successfully sequenced, but it’s going great guns now. Take a look at Jennifer Raff’s article in Forbes that discusses ancient DNA findings in the Americas, Europe, Southeast Asia and perhaps most surprising, a first generation descendant of a Neanderthal and a Denisovan.

From Early human dispersals within the Americas by Moreno-Mayar et al, Science 07 Dec 2018

Inroads were made into a deeper understanding of human migration in the Americas as well in the paper Early human dispersals within the Americas by Moreno-Mayar et al.

I look for 2019 and on into the future to hold many more revelations thanks to ancient DNA sequencing as well as using those sequences to assist in understanding the migration patterns of ancient people that eventually became us.

Barbara Rae-Venter and the Golden State Killer Case

Using techniques that adoptees use to identify their close relatives and eventually, their parents, Barbara Rae-Venter assisted law enforcement with identifying the man, Joseph DeAngelo, accused (not yet convicted) of being the Golden State Killer (GSK).

A very large congratulations to Barbara, a retired patent attorney who is also a genealogist. Nature recognized Ms. Rae-Venter as one of 2018’s 10 People Who Mattered in Science.

DNA in the News

DNA is also represented on the 2018 Nature list by Viviane Slon, a palaeogeneticist who discovered an ancient half Neanderthal, half Denisovan individual and sequenced their DNA, and He Jiankui, a Chinese scientist who claims to have created a gene-edited baby, which has sparked widespread controversy. As of the end of the year, He Jiankui’s research activities have been suspended and he is reportedly sequestered in his apartment, under guard, although the details are far from clear.

In 2013, 23andMe patented the technology for designer babies and I removed my kit from their research program. I was concerned at the time that this technology knife could cut two ways, both for good, eliminating fatal disease-causing mutations and also for ethically questionable practices, such as eugenics. I was told at the time that my fears were unfounded, because that “couldn’t be done.” Well, 5 years later, here we are. I expect the debate about the ethics and eventual regulation of gene-editing will rage globally for years to come.

Elizabeth Warren’s DNA was also in the news when she took a DNA test in response to political challenges. I wrote about what those results meant scientifically, here. This topic became highly volatile and politicized, with everyone seeming to have a very strongly held opinion. Regardless of where you fall on that opinion spectrum (and no, please do not post political comments as they will not be approved), the topic is likely to surface again in 2019 due to the fact that Elizabeth Warren has just today announced her intention to run for President. The good news is that DNA testing will likely be discussed, sparking curiosity in some people, perhaps encouraging them to test. The bad news is that some of the discussion may be unpleasant at best, and incorrect click-bait at worst. We’ve already had a rather unpleasant sampling of this.

Law Enforcement and Genetic Genealogy

The Golden State Killer case sparked widespread controversy about using GedMatch and potentially other genetic genealogy databases to assist in catching people who have committed violent crimes, such as rape and murder.

GedMatch, the database used for the GSK case, has made it very clear in their terms and conditions that DNA matches may be used for both adoptees seeking their families and for other uses, such as law enforcement seeking matches to DNA sequenced during a criminal investigation. Since April 2018, more than 15 cold case investigations have been solved using the same technique and results at GedMatch. Initially some people removed their DNA from GedMatch, but it appears that the overwhelming sentiment, based on uploads, is that people either aren’t concerned or welcome the opportunity for their DNA matches to assist in apprehending criminals.

Parabon Nanolabs in May established a genetic genealogy division headed by CeCe Moore who has worked in the adoptee community for the past several years. The division specializes in DNA testing forensic samples and then assisting law enforcement with the associated genetic genealogy.

Currently, GedMatch is the only vendor supporting the use of forensic sample matching. Neither 23andMe nor Ancestry allows uploaded data, and MyHeritage and Family Tree DNA’s terms of service currently preclude this type of use.

MyHeritage

Wow, talk about coming onto the DNA world stage with a boom.

MyHeritage went from a somewhat wobbly DNA start about 2 years ago to rolling out a chromosome browser at the end of January and adding important features such as SmartMatching, which matches your DNA and your family trees. Add triangulation to this mixture, along with record matching, and you’ve got a #1 winning combination.

It was Gilad Japhet, the MyHeritage CEO, who at Rootstech christened 2018 “The Year of the Segment,” and I do believe he was right. Additionally, he announced that MyHeritage partnered with the adoption community by offering 15,000 free kits to adoptees.

In November, MyHeritage hosted MyHeritage LIVE, their first user conference, in Oslo, Norway, which focused on both their genealogical records offerings and DNA. This was a resounding success and I hope MyHeritage will continue to sponsor conferences and invest in DNA. You can test your DNA at MyHeritage or upload your results from other vendors (instructions here). You can follow my journey and the conference in Oslo here, here, here, here and here.

GDPR

GDPR caused a lot of misery, and I’m glad the implementation is behind us, but the ripples will be affecting everyone for years to come.

GDPR, the European General Data Protection Regulation, which went into effect on May 25, 2018, has been a mixed and confusing bag for genetic genealogy. I think the concept of users being in charge and understanding what is happening with their data, and in this case, their data plus their DNA, is absolutely sound. The requirements, however, were created without any consideration to this industry – which is small by comparison to the Googles and Facebooks of the world. However, the Googles and Facebooks of the world along with many larger vendors seem to have skated, at least somewhat.

Other companies, such as World Families Network and Oxford Ancestors, shut their doors or restricted their offerings in other ways. Vendors such as Ancestry and Family Tree DNA had to make unpopular changes in how their users interface with their software – in essence making genetic genealogy more difficult without any corresponding positive return. The potential fines, 20 million plus Euro for any company holding data for EU residents, made it unwise to ignore the mandates.

In the genetic genealogy space, the shuttering of both YSearch and MitoSearch was heartbreaking, because that was the only location where you could actually compare Y STR and mitochondrial HVR1/2 results. Not everyone uploaded their results, and the sites had not been updated in a number of years, but the closure due to GDPR was still a community loss.

Today, mitoydna.org, a nonprofit comprised of genetic genealogists, is making strides in replacing that lost functionality, plus, hopefully more.

On to more positive events.

Family Tree DNA

In April, Family Tree DNA announced a new version of the Big Y test, the Big Y-500 in which at least 389 additional STR markers are included with the Big Y test, for free. If you’re lucky, you’ll receive between 389 and 439 new markers, depending on how many STR markers above 111 have quality reads. All customers are guaranteed a minimum of 500 STR markers in total. Matching was implemented in December.

These additional STR markers allow genealogists to assemble additional line marker mutations to more granularly identify specific male lineages. In other words, maybe I can finally figure out a line marker mutation that will differentiate my ancestor’s line from other sons of my founding ancestor😊

In June, Family Tree DNA announced that they had named more than 100,000 SNPs which means many haplogroup additions to the Y tree. Then, in September, Family Tree DNA published their Y haplotree, with locations, publicly for all to reference.

I was very pleased to see this development, because Family Tree DNA clearly has the largest Y database in the industry, by far, and now everyone can reap the benefits.

In October, Family Tree DNA published their mitochondrial tree publicly as well, with corresponding haplogroup locations. It’s nice that Family Tree DNA continues to be the science company.

You can test your Y DNA, mitochondrial or autosomal (Family Finder) at Family Tree DNA. They are the only vendor offering full Y and mitochondrial services complete with matching.

2018 Conferences

Of course, there are always the national conferences we’re familiar with, but more and more, online conferences are becoming available, as well as some sessions from the more traditional conferences.

I attended Rootstech in Salt Lake City in February (brrrr), which was lots of fun because I got to meet and visit with so many people including Mags Gaulden, above, who is a WikiTree volunteer and writes at Grandma’s Genes, but as a relatively expensive conference to attend, Rootstech was pretty miserable. Rootstech has reportedly made changes and I hope it’s much better for attendees in 2019. My attendance is very doubtful, although I vacillate back and forth.

On the other hand, the MyHeritage LIVE conference was amazing with both livestreamed and recorded sessions which are now available free here along with many others at Legacy Family Tree Webinars.

Family Tree University held a Virtual DNA Conference in June and those sessions, along with others, are available for subscribers to view.

The Virtual Genealogical Association was formed for those who find it difficult or impossible to participate in local associations. They too are focused on education via webinars.

Genetic Genealogy Ireland continues to provide their yearly conference sessions both livestreamed and recorded for free. These aren’t just for people with Irish genealogy. Everyone can benefit and I enjoy them immensely.

Bottom line, you can sit at home and educate yourself now. Technology is wonderful!

2019 Conferences

In 2019, I’ll be speaking at the National Genealogical Society Family History Conference, Journey of Discovery, in St. Charles, providing the Special Thursday Session titled “DNA: King Arthur’s Mighty Genetic Lightsaber” about how to use DNA to break through brick walls. I’ll also see attendees at Saturday lunch when I’ll be providing a fun session titled “Twists and Turns in the Genetic Road.” This is going to be a great conference with a wonderful lineup of speakers. Hope to see you there.

There may be more speaking engagements at conferences on my 2019 schedule, so stay tuned!

The Leeds Method

In September, Dana Leeds publicized The Leeds Method, another way of grouping your matches that clusters matches in a way that indicates your four grandparents.

I combine the Leeds method with DNAPainter. Great job Dana!

Genetic Affairs

In December, Genetic Affairs introduced an inexpensive subscription reporting and visual clustering methodology, but you can try it for free.

I love this grouping tool. I have already found connections I didn’t know existed previously. I suggest joining the Genetic Affairs User Group on Facebook.

DNAGedcom.com

I wrote an article in January about how to use the DNAGedcom.com client to download the trees of all of your matches and sort to find specific surnames or locations of their ancestors.

However, in December, DNAGedcom.com added another feature with their new DNAGedcom client just released that downloads your match information from all vendors, compiles it and then forms clusters. They have worked with Dana Leeds on this, so it’s a combination of the various methodologies discussed above. I have not worked with the new tool yet, as it has just been released, but Kitty Cooper has and writes about it here. If you are interested in this approach, I would suggest joining the Facebook DNAGedcom User Group.

Rootsfinder

I have not had a chance to work with Rootsfinder beyond the very basics, but Rootsfinder provides genetic network displays for people that you match, as well as triangulated views. Genetic network visualizations are great ways to discern patterns. The tool creates match or triangulation groups automatically for you.

Training videos are available at the website and you can join the Rootsfinder DNA Tools group at Facebook.

Chips and Imputation

Illumina, the chip maker that provides the DNA chips that most vendors use to test, changed from the OmniExpress to the GSA chip during the past year. Older chips have been available, but won’t be forever.

The newer GSA chip is only partially compatible with the OmniExpress chip, providing limited overlap between the older and the new results. This has forced the vendors to use imputation to equalize the playing field between the chips, so to speak.

This has also caused a significant hardship for GedMatch, which is now in the position of trying to match reasonably between many different chips that sometimes overlap minimally. GedMatch introduced Genesis as a sandbox beta version previously, but is now in the process of combining regular GedMatch and Genesis into one. Yes, there are problems and matching challenges. Patience is the key word as the various vendors and GedMatch adapt and improve their required migration to imputation.

DNA Central

In June, Blaine Bettinger announced DNACentral, an online monthly or yearly subscription site as well as a monthly newsletter that covers news in the genetic genealogy industry.

Many educators in the industry have created seminars for DNACentral. I just finished recording “Getting the Most out of Y DNA” for Blaine.

Even though I work in this industry, I still subscribed – initially to show support for Blaine, thinking I might not get much out of the newsletter. I’m pleased to say that I was wrong. I enjoy the newsletter and will be watching sessions in the Course Library and the Monthly Webinars soon.

If you or someone you know is looking for “how to” videos for each vendor, DNACentral offers “Now What” courses for Ancestry, MyHeritage, 23andMe, Family Tree DNA and Living DNA in addition to topic specific sessions like the X chromosome, for example.

Social Media

2018 has seen a huge jump in social media usage, which is both bad and good. The good news is that many new people are engaged. The bad news is that people are often given faulty advice and, for new people, it’s very difficult (nigh on impossible) to tell who is credible and who isn’t. I created a Help page for just this reason.

You can help with this issue by recommending subscribing to these three blogs, not just reading an article, to newbies or people seeking answers.

Always feel free to post links to my articles on any social media platform. Share, retweet, whatever it takes to get the words out!

The general genetic genealogy social media group I would recommend if I were to select only one would be Genetic Genealogy Tips and Techniques. It’s quite large but well-managed and remains positive.

I’m a member of many additional groups, several of which are vendor or interest specific.

Genetic Snakeoil

Now the bad news. Everyone has noticed the popularity of DNA testing – including shady characters.

Be careful, very VERY careful who you purchase products from and where you upload your DNA data.

If something is free, and you’re not within a well-known community, then YOU ARE THE PRODUCT. If it sounds too good to be true, it probably is. If it sounds shady or questionable, it’s probably that and more, or less.

If reputable people and vendors tell you that no, they really can’t determine your Native American tribe, for example, no other vendor can either. Just yesterday, a cousin sent me a link to a “tribe” in Canada that will, “for $50, we find one of your aboriginal ancestors and the nation stamps it.” On their list of aboriginal people we find one of my ancestors who, based on mitochondrial DNA tests, is clearly NOT aboriginal. Snake oil comes in lots of flavors with snake oil salesmen looking to prey on other people’s desires.

When considering DNA testing or transfers, make sure you fully understand the terms and conditions, where your DNA is going, who is doing what with it, and your recourse. Yes, read every single word of those terms and conditions. For more about legalities, check out Judy Russell’s blog.

Recommended Vendors

All those DNA tests look yummy-good, but in terms of vendors, I heartily recommend staying within the known credible vendors, as follows (in alphabetical order).

For genetic genealogy for ethnicity AND matching:

  • 23andMe
  • Ancestry
  • Family Tree DNA
  • GedMatch (not a vendor because they don’t test DNA, but a reputable third party)
  • MyHeritage

You can read about Which DNA Test is Best here although I need to update this article to reflect the 2018 additions by MyHeritage.

Understand that both 23andMe and Ancestry will sell your DNA if you consent and if you consent, you will not know who is using your DNA, where, or for what purposes. Neither Family Tree DNA, GedMatch, MyHeritage, Genographic Project, Insitome, Promethease nor LivingDNA sell your DNA.

The next group of vendors offers ethnicity without matching:

  • Genographic Project by National Geographic Society
  • Insitome
  • LivingDNA (currently working on matching, but not released yet)

Health (as a consumer, meaning you receive the results)

Medical (as a contributor, meaning you are contributing your DNA for research)

  • 23andMe
  • Ancestry
  • DNA.Land (not a testing vendor, doesn’t test DNA)

There are a few other niche vendors known for specific things within the genetic genealogy community, many of whom are mentioned in this article, but other than known vendors, buyer beware. If you don’t see them listed or discussed on my blog, there’s probably a reason.

What’s Coming in 2019

Just like we couldn’t have foreseen much of what happened in 2018, we don’t have access to a 2019 crystal ball, but it looks like 2019 is taking off like a rocket. We do know about a few things to look for:

  • MyHeritage is waiting to see if envelope and stamp DNA extractions are successful so that they can be added to their database.
  • www.totheletterDNA.com is extracting (attempting to) and processing DNA from stamps and envelopes for several people in the community. Hopefully they will be successful.
  • LivingDNA has been working on matching since before I met with their representative in October of 2017 in Dublin. They are now in Beta testing for a few individuals, but they have also just changed their DNA processing chip – so how that will affect things and how soon they will have matching ready to roll out the door is unknown.
  • Ancestry did a 2018 ethnicity update, integrating ethnicity more tightly with Genetic Communities, offered genetic traits and made some minor improvements this year, along with adding one questionable feature – showing your matches the location where you live as recorded in your profile. (23andMe subsequently added the same feature.) Ancestry recently said that they are promising exciting new tools for 2019, but somehow I doubt that the chromosome browser that’s been on my Christmas list for years will be forthcoming. Fingers crossed for something new and really useful. In the meantime, we can download our DNA results and upload to MyHeritage, Family Tree DNA and GedMatch for segment matching, as well as utilize Ancestry’s internal matching tools. DNA+tree matching, those green leaf shared ancestor hints, is still their strongest feature.
  • The Family Tree DNA Conference for Project Administrators will be held March 22-24 in Houston this year, and I’m hopeful that they will have new tools and announcements at that event. I’m looking forward to seeing many old friends in Houston in March.

Here’s what I know for sure about 2019 – it’s going to be an amazing year. We as a community and also as individual genealogists will be making incredible discoveries and moving the ball forward. I can hardly wait to see what quandaries I’ve solved a year from now.

What mysteries do you want to unravel?

I’d like to offer a big thank you to everyone who made 2018 wonderful and a big toast to finding lots of new ancestors and breaking down those brick walls in 2019.

Happy New Year!!!

______________________________________________________________


MyHeritage LIVE Conference Day 2 – The Science Behind DNA Matching

The MyHeritage LIVE Oslo conference is but a fond memory now, and I would count it as a resounding success.

Perhaps one of the reasons I enjoyed it so much is the scientific aspect and because the content is very focused on a topic I enjoy without being the size and complexity of Rootstech. The smaller, more intimate venue also provides access to the “right” people as well as the ability to meet other attendees and not be overwhelmed by the sheer size.

Here are some stats:

  • 401 registered guests
  • 28 countries represented including distant places like Australia and South America
  • More than 20 speakers plus the hands-on workshops where specialist teams worked with students
  • 38 sessions and workshops, plus the party
  • 60,000 livestream participants, in spite of the time differences around the world

I was blown away by the number of livestream attendees.

I don’t know what criteria Gilad Japhet will be using to determine “success” but I can’t imagine this conference being judged as anything but.

Let’s take a look at the second day. I spent part of the time talking to people and drifting in and out of the rear of several sessions for a few minutes. I meant to visit some of the workshops, but there was just too much good, distracting content elsewhere.

I began Sunday in Mike Mansfield’s presentation about SuperSearch. Yes, I really did attend a few sessions not about DNA, but my favorite was the session on Improved DNA Matching.

Improved DNA Matching

I’m sure it won’t surprise any of my readers that my favorite presentations were about the actual science of genetic genealogy.

Consumers don’t really need to understand the science behind autosomal results to reap the benefits, but the underlying science is part of what I love – and it’s important for me to understand the underpinnings to be able to unravel the fine points of what the resulting matches are and are not revealing. Misinterpretation of DNA results leading to faulty conclusions is a real issue in genetic genealogy today. Consequently, I feel that anyone working with other people’s results and providing advice really needs to understand how the science and technology work together.

Dr. Daphna Weissglas-Volkov, a population geneticist by training, although she clearly functions far beyond that scope today, gave a very interesting presentation about how MyHeritage handles (their greatly improved) DNA Matching. I’m hitting the high points here, but I would strongly encourage you to watch the video of this session when it is made available online.

In addition to Dr. Weissglas-Volkov’s slides, I’ve added some additional explanations and examples in various places. You can easily tell that the slides are hers and the graphics that aren’t MyHeritage slides are mine.

Dr. Weissglas-Volkov began the session by introducing the MyHeritage science team and then explaining terminology to set the stage.

A match is when two people match each other on a fairly long piece of DNA. Of course, “fairly long” is defined differently by each vendor.

Your genetic map (of your chromosomes) is composed of the DNA you inherit from different ancestors through the process of recombination, which occurs when DNA is transferred from the parents to the child. A centiMorgan is a measure of the relative likelihood that a recombination will occur in a single generation. On average, 36 recombinations occur in each generation, meaning that the DNA can be divided on any chromosome. However, women, for reasons unknown, have about 1.5 times as many recombinations as men.

You can’t see that when looking at an example of a person compared to their parents, of course, because each individual is a full match to each parent, but you can see this visually when comparing a grandchild to their maternal grandmother and their paternal grandmother on a chromosome browser.

The above illustration is the same female grandchild compared to her maternal grandmother, at left, and her paternal grandmother at right. Therefore the number of crossovers at left is through a female child (her mother), and the number at right is through a male child (her father.)

# of Crossovers
Through female child – left 57
Through male child – right 22

There are more segments at left, through the mother, and the segments are generally shorter, because they have been divided into more pieces.

At right, fewer and larger segments through the father.

Keep in mind that because you have a strand of DNA from each parent, with exactly the same “street addresses,” DNA sequencing produces two columns of data, but your Mom’s and Dad’s DNA is intermixed.

The information in the two columns can’t be identified as Mom’s or Dad’s DNA or strand at this point.

That interspersed raw data is called a genotype. A haplotype is when Mom’s and Dad’s DNA can be reassembled into “sides” so you can attribute the two letters at each address to either Mom or Dad.

Here’s a quick example.

The goal, of course, is to figure out how to reassemble your DNA into Mom’s side and Dad’s side so that we know that someone matching you is actually matching on all As (Mom) or all Gs (Dad,) in this example, and not a false match that zigzags back and forth between Mom and Dad.
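
The difference between a genotype and a haplotype can be sketched in a few lines of Python. This is only an illustration: the six addresses and the A/G letters mirror the example above, and every function and variable name here is hypothetical.

```python
# Hypothetical illustration: six addresses where Mom contributed all As
# and Dad contributed all Gs, as in the example above.
mom_haplotype = ["A", "A", "A", "A", "A", "A"]
dad_haplotype = ["G", "G", "G", "G", "G", "G"]


def to_genotype(hap1, hap2):
    """Combine two haplotypes into an unphased genotype.

    Each address keeps both letters, but sorted, so the information
    about which parent contributed which letter is lost.
    """
    return [tuple(sorted(pair)) for pair in zip(hap1, hap2)]


def matches_genotype(candidate, genotype):
    """True if the candidate matches either letter at every address."""
    return all(letter in pair for letter, pair in zip(candidate, genotype))


def matches_haplotype(candidate, hap1, hap2):
    """True only if the candidate matches one parental side all the way."""
    return list(candidate) == list(hap1) or list(candidate) == list(hap2)


genotype = to_genotype(mom_haplotype, dad_haplotype)

# A sequence that zigzags back and forth between Mom's and Dad's letters.
zigzag = ["A", "G", "A", "G", "A", "G"]
```

Compared against the unphased genotype, the zigzag sequence looks like a match at every address. Compared against the two haplotypes, it matches neither side, which is exactly the kind of false match phasing is meant to weed out.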

The best way to accomplish that goal of course is trio phasing, when the child and both parents are available, so by comparing the child’s DNA with the parents you can assign the two strands of the child’s DNA.
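
Here is a minimal sketch of the idea behind trio phasing, assuming each person's data is a list of two-letter pairs, one pair per address. The function names and the sample data are hypothetical, and a real implementation would have to deal with no-calls and read errors that this toy version ignores.

```python
def phase_site(child, mother, father):
    """Return (from_mother, from_father) for one address.

    Returns None when the address is ambiguous (for example, when the
    child and both parents carry the same two different letters) or
    when the three genotypes are inconsistent with each other.
    """
    a, b = child
    if a == b:
        # Same letter twice: each parent must carry that letter.
        return (a, b) if a in mother and a in father else None
    a_from_mom = a in mother and b in father
    b_from_mom = b in mother and a in father
    if a_from_mom and not b_from_mom:
        return (a, b)
    if b_from_mom and not a_from_mom:
        return (b, a)
    return None


def trio_phase(child, mother, father):
    """Phase every address of the child using both parents."""
    return [phase_site(c, m, f) for c, m, f in zip(child, mother, father)]


# Made-up data for four addresses.
child = [("A", "G"), ("C", "T"), ("A", "G"), ("T", "T")]
mother = [("A", "A"), ("C", "T"), ("A", "G"), ("T", "C")]
father = [("G", "G"), ("T", "T"), ("A", "G"), ("T", "T")]

phased = trio_phase(child, mother, father)
```

In this made-up data, the third address cannot be assigned because the child and both parents all carry A and G there, so even trio phasing leaves a few addresses unresolved.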

Unfortunately, few people have both or even one parent available in order to actually divide their DNA into “sides,” so the next best avenue is statistical phasing. I’ve called this academic phasing in the past, as compared to parental phasing which MyHeritage refers to as trio phasing.

There’s a huge amount of confusion about phasing, with few people understanding there are two distinct types.

Statistical phasing is a type of machine learning where a large number of reference populations are studied. Since we know that DNA travels together in blocks when inherited, statistical phasing learns which DNA travels with which buddy DNA – and creates probabilities. Your DNA is then compared to these models and your DNA is reshuffled in order to assemble your DNA into two groups – one representing your Mom’s DNA and one representing your Dad’s DNA, according to statistical probability.

Looking at your genotype, if we know that As group together at those 6 addresses in my example 95% of the time, then we know that the most likely scenario to create a haplotype is that all of the As came from one parent and all of the Gs from the other parent – although without additional information, there is no way to yet assign the maternal and paternal identifier. At this point, we only know parent 1 and parent 2.
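
To make the probability idea concrete, here is a toy sketch. It is not MyHeritage's method, and real statistical phasing uses far more sophisticated models. It simply tries every way of splitting a genotype into two haplotypes and keeps the split whose two haplotypes are most common in a reference panel. The panel contents and the function name are made up for illustration.

```python
from collections import Counter
from itertools import product


def statistical_phase(genotype, reference_haplotypes):
    """Pick the haplotype pair that best explains the genotype.

    Each candidate pair is scored by multiplying the panel frequencies
    of its two haplotypes. Returns ((hap1, hap2), score).
    """
    counts = Counter(reference_haplotypes)
    total = sum(counts.values())
    best_pair, best_score = None, -1.0
    # At each address, either letter could belong to "parent 1".
    for choice in product([0, 1], repeat=len(genotype)):
        hap1 = "".join(pair[c] for pair, c in zip(genotype, choice))
        hap2 = "".join(pair[1 - c] for pair, c in zip(genotype, choice))
        score = (counts[hap1] / total) * (counts[hap2] / total)
        if score > best_score:
            best_pair, best_score = (hap1, hap2), score
    return best_pair, best_score


# Unphased genotype: an A and a G at each of six addresses.
genotype = [("A", "G")] * 6

# Made-up reference panel in which the letters mostly travel together.
panel = ["AAAAAA"] * 95 + ["GGGGGG"] * 95 + ["AGAGAG"] * 5 + ["GAGAGA"] * 5

best_pair, best_score = statistical_phase(genotype, panel)
```

With this panel, the all-A and all-G haplotypes win easily. As noted above, the result only identifies parent 1 and parent 2, not which one is Mom and which one is Dad.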

In order to train the computers (machine learning) to properly statistically phase testers’ results, MyHeritage uses known relationships of people to teach the machines. In other words, their reference panels of proven haplotypes grow all of the time as parent/child trios test.

Dr. Weissglas-Volkov then moved on to imputation.

When sequencing DNA, not every location reads accurately, so the missing values can be imputed, or “put back” using imputation.

Initially imputation was a hot mess. Not just for MyHeritage, but for all vendors, imputation having been forced upon them (and therefore us) by Illumina’s change to the GSA chip.

However, machine learning means that imputation models improve constantly, and matching using imputation is greatly improved at MyHeritage today.

Imputation can do more than just fill in blanks left by sequencing read errors.

The benefit of imputation to the genetic genealogy community is that it makes uploads workable. Because vendors use disparate chips, vendors that want to allow uploads have been forced to use imputation to create a global template that incorporates all of the locations from each vendor, and then to impute the values they don’t actually test for themselves to complete the full template for each person.

In the example below, you can see that no vendor tests all available locations, but when imputation extends the sequences of all testers to the full 1-500 locations, the results can easily be compared to every other tester because every tester now has values in locations 1-500, regardless of which vendor/chip was utilized in their actual testing.

Therefore, using imputation, MyHeritage is able to match between quite disparate chips, such as the traditional Illumina chips (OmniExpress), the custom Ancestry chip and the new GSA chip utilized by 23andMe and LivingDNA.
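
The global template idea can be sketched as follows. This is a simplified illustration, not MyHeritage's code: the vendor location ranges are invented, and the `impute` function is a stand-in that always returns the same letter, whereas real imputation infers each value statistically from reference data.

```python
def build_template(*vendor_locations):
    """Global template: every location tested by at least one vendor."""
    return sorted(set().union(*vendor_locations))


def fill_template(tested, template, impute):
    """Return {location: (value, source)} for every template location.

    Values the tester's chip actually read are kept as "tested".
    Everything else is filled in by the impute function.
    """
    return {
        loc: (tested[loc], "tested") if loc in tested else (impute(loc), "imputed")
        for loc in template
    }


# Hypothetical chips: vendor A tests locations 1-300, vendor B 200-500.
vendor_a = range(1, 301)
vendor_b = range(200, 501)
template = build_template(vendor_a, vendor_b)

# A tester from vendor A, with a made-up value at every tested location.
tester = {loc: "A" for loc in vendor_a}


def placeholder_impute(loc):
    """Stand-in only: a real model would infer the most likely value."""
    return "G"


filled = fill_template(tester, template, placeholder_impute)
imputed_count = sum(1 for value, source in filled.values() if source == "imputed")
```

After filling, the vendor A tester has a value at all 500 locations, 200 of them imputed, and can be compared location by location with a tester from vendor B.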

So, how are matches determined?

Matching

First your DNA and that of another person are scanned for nearly identical seed sequences.

A minimum segment length of 6 cM must be identified for further match processing to occur. Anything below 6 cM is discarded at this point.

The match is then further evaluated to see if the seed match is of a high enough quality that it should be perfected and should count as a match. Other segments continue to be evaluated as well. If the total matching segment(s) is 8 total cM or greater, it’s considered a valid match. MyHeritage has taken the position that they would rather give you a few accidental false matches than to miss good matches. I appreciate that position.

Window cleaning is how they refer to the process of removing pileup regions known to occur in the human genome. This is NOT the same as Ancestry’s routine that removes areas they determine to be “too matchy” for you individually.

The difference is that in humans, for example, there is a segment of chromosome 6 where, for some reason, almost all humans match. Matching across that segment is not informative for genetic genealogy, so that region along with several others similar in nature are removed. At Ancestry, those genome-wide pileup segments are removed, along with other regions where Ancestry decides that you personally have too many matches. The problem is that for me, these “too matchy” segments are many of my Acadian matches. Acadians are endogamous, so lots of them match each other because as a small intermarried population, they share a great deal of the same DNA. However, to me, because I have one great-grandfather that’s Acadian, that “too matchy” information IS valuable although I understand that it wouldn’t be for someone that is 100% Acadian or Jewish.
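
The genome-wide kind of window cleaning can be pictured with a small sketch. I’m assuming here that a segment is dropped only when it falls entirely inside a known pileup region. The actual rules were not spelled out, and the coordinates below are made up for illustration.

```python
# Hypothetical pileup regions as (chromosome, start_cM, end_cM).
# The coordinates are invented for illustration only.
PILEUP_REGIONS = [(6, 40.0, 55.0)]


def window_clean(segments, pileups=PILEUP_REGIONS):
    """Drop segments that lie entirely inside a known pileup region.

    Segments are (chromosome, start_cM, end_cM) tuples. The same list
    of pileup regions is applied to every tester.
    """
    def inside(segment, region):
        return (
            segment[0] == region[0]
            and segment[1] >= region[1]
            and segment[2] <= region[2]
        )

    return [s for s in segments if not any(inside(s, r) for r in pileups)]


segments = [(6, 42.0, 50.0), (6, 10.0, 25.0), (3, 42.0, 50.0)]
cleaned = window_clean(segments)
```

The important point is that the list of regions is the same for everyone. A per-person “too matchy” filter would instead build a different list for each tester, which is how informative endogamous matches can get removed.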

In situations such as Ashkenazi Jewish matching, which is highly endogamous, MyHeritage uses a higher matching threshold. Otherwise every Ashkenazi person would match every other Ashkenazi person because they all descend from a small founder population, and for genealogy, that’s not useful.

The last step in processing matches is to establish the confidence level that the match is accurately predicted at the correct level – meaning the relationship range based on the amount of matching DNA and other criteria.

For example, does this match cluster with other proven matches of the same known relationship level?

From several confidence ascertainment steps, a confidence score is assigned to the predicted relationship.

Of course, you as a customer see none of this background processing, just the fact that you do match, the size of the match and the confidence score. That’s what genealogists need!

Matching Versus Triangulation Thresholds

Confusion exists about matching thresholds versus triangulation thresholds.

While any single segment must be at least 6 cM in length for the matching process to begin, the actual match threshold at MyHeritage is a total of 8 cM.

I took a look at my lowest match at MyHeritage.

I have two segments, one 6.1 cM segment, and one 6 cM segment that match. It would appear that if I only had one 6 cM segment, it would not show as a match because I didn’t have the minimum 8 cM total.
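
The two thresholds, as I understand them, can be written out as a short sketch. The constant and function names are my own, and the quality evaluation of the seed match is left out entirely.

```python
MIN_SEGMENT_CM = 6.0  # individual segments below this are discarded
MIN_TOTAL_CM = 8.0    # total of the surviving segments needed for a match


def evaluate_match(segment_lengths_cm):
    """Apply both thresholds; return (is_match, kept_segments, total_cm)."""
    kept = [cm for cm in segment_lengths_cm if cm >= MIN_SEGMENT_CM]
    total = sum(kept)
    return total >= MIN_TOTAL_CM, kept, total


lowest_match = evaluate_match([6.1, 6.0])  # my lowest match: two segments
single_segment = evaluate_match([6.0])     # a lone 6 cM segment
```

The two segments together total about 12.1 cM, so they pass. A lone 6 cM segment survives the segment threshold but falls short of the 8 cM total, so it would not be reported as a match.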

Triangulation Threshold

However, after you pass that matching criteria and move on to triangulation with a matching individual, you have the option of selecting the triangulation threshold, which is not the same thing as the match threshold. The match threshold does not change, but you can change the triangulation threshold from 2 cM to 8 cM and selections in-between.

In the example below, I’m comparing myself against two known relatives.

You won’t be shown any matches below the 6 cM individual segment threshold, BUT you can view triangulated segments of different sizes. This is because matching segments often don’t line up exactly and the triangulated overlap between several individuals may be very small, but may still be useful information.

Flying your mouse over the location in the bubble, which is the triangulated segment, tells you the size of the triangulated portion. If you selected the 2 cM triangulation, you would see smaller triangulated portions of matches.
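
To see why the triangulated portion can be much smaller than any of the matching segments, here is a rough sketch. It assumes all three segments are on the same chromosome and, for simplicity, expresses positions directly in cM. The function names and numbers are hypothetical.

```python
def overlap(seg_a, seg_b):
    """Overlap of two (start_cM, end_cM) segments, or None if they don't overlap."""
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if end > start else None


def triangulated_portion(me_vs_a, me_vs_b, a_vs_b, threshold_cm):
    """Region shared by all three pairwise matches, if it meets the threshold."""
    first = overlap(me_vs_a, me_vs_b)
    if first is None:
        return None
    common = overlap(first, a_vs_b)
    if common is None or common[1] - common[0] < threshold_cm:
        return None
    return common


# Three pairwise matching segments, each at least 6 cM long.
me_vs_a = (10.0, 22.0)
me_vs_b = (18.0, 30.0)
a_vs_b = (15.0, 21.5)

at_2cm = triangulated_portion(me_vs_a, me_vs_b, a_vs_b, threshold_cm=2.0)
at_6cm = triangulated_portion(me_vs_a, me_vs_b, a_vs_b, threshold_cm=6.0)
```

Here the three segments overlap for only 3.5 cM, from 18.0 to 21.5. That portion shows up with the 2 cM triangulation setting but not with a 6 cM setting, even though every individual matching segment is comfortably above the 6 cM segment threshold.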

Closing Session

The conference was closed by Aaron Godfrey, a super-nice MyHeritage employee from the UK. The closing session is worth watching on the recorded livestream when it becomes available, in part because there are feel good moments.

However, the piece of information I was looking for was whether there will be a MyHeritage LIVE conference in 2019, and if so, where.

I asked Gilad afterwards and he said that they will be evaluating the feedback from attendees and others when making that decision.

So, if you attended or joined the livestream sessions and found value, please let MyHeritage know so that they can factor your feedback into their decision. If there are topics you’d like to see as sessions, I’m sure they’d love to hear about that too. Me, I’m always voting for more DNA😊

I hope to hear about MyHeritage LIVE 2019, and I’m voting for any of the following locations:

  • Australia
  • New Zealand
  • Israel
  • Germany
  • Switzerland

What do you think?

______________________________________________________________


Ancestry Step by Step Guide: How to Upload-Download DNA Files

In this Upload-Download Series, we’ll cover each major vendor:

  • How to download raw data files from the vendor
  • How to upload raw data files to the vendor, if possible
  • Other mainstream vendors where you can upload this vendor’s files

Uploading TO Ancestry

This part is easy with Ancestry because Ancestry doesn’t accept any other vendor’s files. There is no ability to upload TO Ancestry. You have to test with Ancestry if you want DNA results from Ancestry.

Downloading FROM Ancestry

In order to upload your Ancestry autosomal DNA file to another testing vendor, or GedMatch, for either matching or ethnicity, you’ll need to first download the file from Ancestry. This doesn’t in any way affect your DNA matches at Ancestry. You’re only downloading a copy of the raw data file.

Step 1

Sign in to your account at Ancestry and click on the DNA Results Summary link.

Step 2

Click on the Settings gear, at the far upper right-hand corner of the summary page, just beneath your Ancestry user ID.

Step 3

Scroll way to the bottom and click on the link for “Download Raw DNA Data.”

Step 4

Enter your password and click on “I Understand,” after reading of course.

Then click “Confirm.”

Step 5

Ancestry will send an e-mail to the e-mail address where you are registered with Ancestry. Check your inbox for that e-mail.

Waiting…waiting.

Still waiting…

If the e-mail doesn’t arrive shortly, check your spam folder. If you’ve changed e-mail addresses, check to be sure your new one is registered with Ancestry. That’s on the same Settings page. If all else fails, request the e-mail again.

Step 6

Ahhh, it’s finally here.

Click on the green “Confirm Data Download” and do not close the window.

Step 7

Next, click on the green “Download DNA Data.”

You’ll see the following confirmation screen along with the downloaded file at the bottom.

Step 8

At the bottom of the page, above, if you’re on a PC, you’ll see the name of the zipped file.

The file name will be “dna-data-2021-07-31” where the date is the date you downloaded the file. I would suggest adding the word Ancestry to the front when you save the file on your system.

Most vendors want an unopened zip file, so if you want to open your file, first copy it to another name. Otherwise, you’ll have to download again.
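
If you’re comfortable with a little scripting, the “copy it first” step can be automated. This is just a sketch: the function name is my own, and I’m assuming the download is a .zip file that you have already renamed with Ancestry at the front.

```python
import shutil
from pathlib import Path


def copy_for_viewing(zip_path):
    """Make a '-copy' duplicate of the downloaded zip and return its path.

    The original file is left untouched so it can be uploaded unopened.
    Only the copy should be opened or unzipped.
    """
    source = Path(zip_path)
    target = source.with_name(f"{source.stem}-copy{source.suffix}")
    shutil.copy2(source, target)
    return target
```

For example, calling `copy_for_viewing("Ancestry-dna-data-2021-07-31.zip")` would create `Ancestry-dna-data-2021-07-31-copy.zip` in the same folder. Open the copy and upload the original.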

That’s it, you’re done!

Ancestry DNA File Uploads to Other Vendors

Ancestry testing falls into two different categories: V1, for tests taken before May of 2016, and V2, the current version as of August 2021, which includes tests taken after May 2016. Tests processed during May 2016 could be either version. However, the major vendors accept both files, so the version no longer matters.

The difference between V1 and V2 files is that Ancestry changed the chips they use to test and different DNA positions are tested, resulting in a file of a different format.

Not all vendors accept uploads, but you can upload your Ancestry DNA file, as follows:

Ancestry V1 and V2 files are accepted as follows:

  • Family Tree DNA**: Yes
  • MyHeritage***: Yes
  • 23andMe*: No
  • GedMatch: Yes

*Note that 23andMe in 2018 allowed a one-time upload from Ancestry, but people who uploaded results did not receive matches from 23andMe. You need to test at 23andMe.

**Note that the upload to Family Tree DNA and matching is free, but advanced tools including the chromosome browser and ethnicity require a $19 unlock fee. That fee is less expensive than retesting.

***MyHeritage provides free matching and basic tools. You’ll need either a $29 unlock or a full subscription to utilize all of the MyHeritage advanced DNA and genealogy tools. You can upload your DNA file here, and try the subscription for free, here.

Testing and Upload Strategy

My recommendation, if you test at Ancestry, is to upload your DNA file to MyHeritage, Family Tree DNA, and GedMatch.

I wrote step-by-step upload instructions for:

Have fun!

Please note that this article was updated in August 2021.

_____________________________________________________________


2017 – The Year of DNA

Every year for the past 17 years has been the year of DNA for me, but for many millions, 2017 has been the year of DNA. DNA testing has become a phenomenon in its own right.

It was in 2013 that Spencer Wells predicted that 2014 would be the “year of infection.” Spencer was right and in 2014 DNA joined the ranks of household words. I saw DNA in ads that year, for the first time, not related to DNA testing or health as in, “It’s in our DNA.”

In 2014, it seemed like most people had heard of DNA, even if they weren’t all testing yet. John Q. Public was becoming comfortable with DNA.

In 2017 – DNA Is Mainstream  

If you’re a genealogist, you certainly know about DNA testing, and you’re behind the times if you haven’t tested.  DNA testing is now an expected tool for genealogists, and part of a comprehensive proof statement that meets the genealogical proof standard which includes “a reasonably exhaustive search.”  If you haven’t applied DNA, you haven’t done a reasonably exhaustive search.

A paper trail is no longer sufficient alone.

When I used to speak to genealogy groups about DNA testing, back in the dark ages, in the early 2000s, and I asked how many had tested, a few would raise their hands – on a good day.

In October, when I asked that same question in Ireland, more than half the room raised their hand – and I hope the other half went right out and purchased DNA test kits!

Consequently, because the rabid genealogical market is now pretty much saturated, the DNA testing companies needed to find a way to attract new customers, and they have.

2017 – The Year of Ethnicity

I’m not positive that the methodology some of the major companies utilized to attract new consumers is ideal, but nonetheless, advertising has attracted many new people to genetic genealogy through ethnicity testing.

If you’re a seasoned genetic genealogist, I know for sure that you’re groaning now, because the questions that are asked by disappointed testers AFTER the results come back and aren’t what people expected find their way to the forums that genetic genealogists peruse daily.

I wish those testers would have searched out those forums, or read my comparative article about ethnicity tests and which one is “best” before they tested.

More ethnicity results are available from vendors and third parties alike – just about every place you look it seems.  It appears that lots of folks think ethnicity testing is a shortcut to instant genealogy. Spit, mail, wait and voila – but there is no shortcut.  Since most people don’t realize that until after they test, ethnicity testing is becoming ever more popular with more vendors emerging.

In the spring, LivingDNA began delivering ethnicity results and a few months later, MyHeritage as well.  Ethnicity is hot and companies are seizing a revenue opportunity.

Now, the good news is that perhaps some of these new ethnicity testers can be converted into genealogists.  We just have to view ethnicity testing as tempting bait, or hopefully, a gateway drug…

2017 – The Year of Explosive Growth

DNA testing has become that snowball rolling downhill that morphed into an avalanche.  More people are seeing commercials, more people are testing, and people are talking to friends and co-workers at the water cooler who decide to test. I passed a table of diners in Germany in July and overheard a discussion, in English, about ethnicity-focused DNA testing.

If you haven’t heard of DTC, direct to consumer, DNA testing, you’re living under a rock or maybe in a third world country without either internet or TV.

Most of the genetic genealogy companies are fairly close-lipped about the size of their data bases of DNA testers, but Ancestry isn’t.  They have gone from about 2 million near the end of 2016 to 5 million in August 2017 to at least 7 million now.  They haven’t said for sure, but extrapolating from what they have said, I feel safe with 7 million as a LOW estimate and possibly as many as 10 million following the holiday sales.

Advertising obviously pays off.

MyHeritage recently announced that their data base has reached 1 million, with only about 20% of those being transfers.

Based on the industry rumble, I suspect that the other DNA testing companies have had banner years as well.

The good news is that all of these new testers mean that anyone who has tested at any of the major vendors is going to get lots of matches soon. Santa, it seems, has heard about DNA testing too and test kits fit into stockings!

That’s even better news for all of us who are in multiple data bases – and even more reason to test at all of the 4 major companies who provide autosomal DNA matching for their customers: Family Tree DNA, Ancestry, MyHeritage and 23andMe.

2017 – The Year of Vendor and Industry Churn

So much happened in 2017, it’s difficult to keep up.

  • MyHeritage entered the DNA testing arena and began matching in September of 2016. Frankly, they had a mess, but they have been working in 2017 to improve the situation.  Let’s just say they still have some work to do, but at least they acknowledge that and are making progress.
  • MyHeritage has a rather extensive user base in Europe. Because of their European draw, their records collections and the ability to transfer results into their data base, they have become the 4th vendor in a field that used to be 3.
  • In March 2017, Family Tree DNA announced that they were accepting transfers of both the Ancestry V2 test, in place since May of 2016, along with the 23andMe V4 test, available since November 2013, for free. MyHeritage has since been added to that list. The Family Tree DNA announcement provided testers with another avenue for matching and advanced tools.
  • Illumina obsoleted their OmniExpress chip, forcing vendors to Illumina’s new GSA chip which also forces vendors to use imputation. I swear, imputation is a swear word. Illumina gets the lump of coal award for 2017.
  • I wrote about imputation here, but in a nutshell, the vendors are now being forced to test only about 20% of the DNA locations available on the previous Illumina chip, and impute or infer using statistics the values in the rest of the DNA locations that they previously could test.
  • Early imputation implementers include LivingDNA (ethnicity only), MyHeritage (to equalize the locations of various vendors’ different chips), DNA.Land (whose matching is far from ideal) and 23andMe, which seems, for the most part, to have done a reasonable job. Of course, the only way to tell for sure at 23andMe is to test again on the V5 chip and compare to V3 and V4 chip matches. Given that I’ve already paid 3 times to test myself at 23andMe (V2, 3 and 4), I’m not keen on paying a 4th time for the V5 version.
  • 23andMe moved to the V5 Illumina GSA chip in August which is not compatible with any earlier chip versions.
  • Needless to say, the Illumina chip change has forced vendors away from focusing on new products in order to develop imputation code in order to remain backwards compatible with their own products from an earlier chip set.
  • GedMatch introduced their sandbox area, Genesis, where people can upload files that are not compatible with the traditional vendor files.  This includes the GSA chip results (23andMe V5,) exome tests and others.  The purpose of the sandbox is so that GedMatch can figure out how to work with these files that aren’t compatible with the typical autosomal test files.  The process has been interesting and enlightening, but people either don’t understand or forget that it’s a sandbox, an experiment, for all involved – including GedMatch.  Welcome to living on the genetic frontier!

  • I assembled a chart of who loves who – meaning which vendors accept transfers from which other vendors.

  • I suspect but don’t know that Ancestry is doing some form of imputation between their V1 and V2 chips. About a month before their new chip implementation in May of 2016, Ancestry made a change in their matching routine that resulted in a significant shift in people’s matches.

Because of Ancestry’s use of the Timber algorithm to downweight some segments and strip out others altogether, it’s difficult to understand where matching issues may arise.  Furthermore, there is no way to know that there are matching issues unless you and another individual have transferred results to either Family Tree DNA or GedMatch, neither of which removes any matching segments.

  • Other developments of note include the fact that Family Tree DNA moved to mitochondrial DNA build V17 and updated their Y DNA to hg38 of the human reference genome – both huge undertakings requiring the reprocessing of customer data. Think of both of those updates as housekeeping. No one wants to do it, but it’s necessary.
  • 23andMe FINALLY finished transferring their customer base to the “New Experience,” but many of the older features we liked are now gone. However, customers can now opt in to open matching, which is a definite improvement. 23andMe, having been the first company to enter the genetic genealogy autosomal matching marketspace has really become lackluster.  They could have owned this space but chose not to focus on genealogy tools.  In my opinion, they are now relegated to fourth place out of a field of 4.
  • Ancestry has updated their Genetic Communities feature a couple of times this year. Genetic Communities is interesting and more helpful than ethnicity estimates, but neither is nearly as helpful as a chromosome browser would be.

  • I’m sure that the repeated requests, begging and community level tantrum throwing in an attempt to convince Ancestry to produce a chromosome browser is beyond beating a dead horse now. That dead horse is now skeletal, and no sign of a chromosome browser. Sigh:(
  • The good news is that anyone who wants a chromosome browser can transfer their results to Family Tree DNA or GedMatch (both for free) and utilize a chromosome browser and other tools at either or both of those locations. Family Tree DNA charges a one time $19 fee to access their advanced tools and GedMatch offers a monthly $10 subscription. Both are absolutely worth every dime. The bad news is, of course, that you have to convince your match or matches to transfer as well.
  • If you can convince your matches to transfer to (or test at) Family Tree DNA, their tools include phased Family Matching which utilizes a combination of user trees, the DNA of the tester combined with the DNA of family matches to indicate to the user which side, maternal or paternal (or both), a particular match stems from.

  • Sites to keep your eye on include Jonny Perl’s tools which include DNAPainter, as well as Goran Rundfeldt’s DNA Genealogy Experiment.  You may recall that in October Goran brought us the fantastic Triangulator tool to use with Family Tree DNA results.  A few community members expressed concern about triangulation relative to privacy, so the tool has been (I hope only temporarily) disabled as the involved parties work through the details. We need Goran’s triangulation tool! Goran has developed other world class tools as well, as you can see from his website, and I hope we see more of both Goran and Jonny in 2018.
  • In 2017, a number of new “free” sites that encourage you to upload your DNA have sprung up. My advice – remember, there really is no such thing as a free lunch.  Ask yourself why, what’s in it for them.  Review ALL OF THE documents and fine print relative to safety, privacy and what is going to be done with your DNA.  Think about what recourse you might or might not have. Why would you trust them?

My rule of thumb, if the company is outside of the US, I’m immediately slightly hesitant because they don’t fall under US laws. If they are outside of Europe or Canada, I’m even more hesitant.  If the company is associated with a country that is unfriendly to the US, I unequivocally refuse.  For example, riddle me this – what happens if a Chinese (or fill-in-the-blank country) company violates an agreement regarding your DNA and privacy?  What, exactly, are you going to do about it from wherever you live?

2017 – The Year of Marketplace Apps

Third party genetics apps are emerging and are beginning to make an impact.

GedMatch, as always, has continued to quietly add to their offerings for genetic genealogists, as has DNAGedcom.com. While these two aren’t exactly “apps,” per se, they are certainly primary players in the third party space. I use both and will be publishing an article early in 2018 about a very useful tool at DNAGedcom.

Another application that I don’t use due to the complex setup (which I’ve now tried twice and abandoned) is Genome Mate Pro which coordinates your autosomal results from multiple vendors.  Some people love this program.  I’ll try, again, in 2018 and see if I can make it all the way through the setup process.

The real news here is the new marketplace apps based on Exome testing.

Helix and their partners offer a number of apps that may be of interest for consumers.  Helix began offering a “test once, buy often” marketplace model where the consumer pays a nominal price for exome sequencing ($80), significantly under market pricing ($500), but then the consumer purchases DNA apps through the Helix store. The apps access the original DNA test to produce results. The consumer does NOT receive their downloadable raw data, only data through the apps, which is a departure from the expected norm. Then again, the consumer pays a drastically reduced price and downloadable exome results are available elsewhere for full price.

The Helix concept is that lots of apps will be developed, meaning that you, the consumer, will be interested and purchase often – allowing Helix to recoup their sequencing investment over time.

Looking at the Helix apps that are currently available, I’ve purchased all of the Insitome products released to date (Neanderthal, Regional Ancestry and Metabolism), because I have faith in Spencer Wells and truthfully, I was curious and they are reasonably priced.

Aside from the Insitome apps, I think that the personalized clothes are cute, if extremely overpriced. But what the heck, they’re fun and raise awareness of DNA testing – a good thing! After all, who am I to talk, I’ve made DNA quilts and have DNA clothing too.

Having said that, I’m extremely skeptical about some of the other apps, like “Wine Explorer.”  Seriously???

But then again, if you named an app “I Have More Money Than Brains,” it probably wouldn’t sell well.

Other apps, like Ancestry’s WeRelate (available for smartphones), are entertaining, but unfortunately also EXTREMELY misleading.  WeRelate conflates multiple trees, generally incorrectly, to suggest that you and another person on your Facebook friends list are related, or that you are related to famous people.  Judy Russell reviews that app here in the article, “No, actually, we’re not related.” No.  Just no!

I feel strongly that companies that utilize our genetic data for anything have a moral responsibility for accuracy, and the WeRelate app clearly does NOT make the grade, and Ancestry knows that.  I really don’t believe that entertaining customers with half-truths (or less) is more important than accuracy – but then again, here I go just being an old-fashioned fuddy-duddy expecting ethics.

And then, there’s the snake oil.  You knew it was going to happen because there is always someone who can be convinced to purchase just about anything. Think midnight infomercials. The problem is that many consumers really don’t know how to tell snake oil from the rest in the emerging DNA field.

You can now purchase DNA testing for almost anything.  Dating, diet, exercise, your taste in wine and of course, vitamins and supplements. If you can think of an opportunity, someone will dream up a test.

How many of these are legitimate or valid?  Your guess is as good as mine, but I’m exceedingly suspicious of a great many, especially those where I can find no legitimate scientific studies to back what appear to be rather outrageous claims.

My main concern is that the entire DTC testing industry will be tarred by the brush of a few unethical opportunists.

2017 – The Year of Focus on Privacy and Security

With increased consumer exposure comes increased notoriety. People are taking notice of DNA testing and it seems that everyone has an opinion, informed or not.  There’s an old saying in marketing: “Talk about me good, talk about me bad, just talk about me.”

With all of the ads has come a commensurate amount of teeth gnashing and “the-sky-is-falling” type reporting.  Unfortunately, many politicians don’t understand this industry and open mouth only to insert foot – except that most people don’t realize what they’ve done.  I doubt that the politicians even understand that they are tasting toe-jam, because they haven’t taken the time to research and understand the industry. Sound bites and science don’t mix well.

The bad news is that next, the click-bait-focused press picks up on the stories and the next time you see anyone at lunch, they’re asking you if what they heard is true.  Or, let’s hope that they ask you instead of just accepting what they heard as gospel. Hopefully if we’ve learned anything in this past year, it’s to verify, verify, verify.

I’ve been an advocate for a very long time of increased transparency from the testing companies as to what is actually done with our DNA, and under what circumstances.  In other words, I want to know where my DNA is and what it’s being used for.  Period.

Family Tree DNA answered that question succinctly and unquestionably in December.

Bennett Greenspan: “We could probably make a lot of money by selling the DNA data that we’ve been collecting over the years, but we feel that the only person that should have your DNA information is you.  We don’t believe that it should be sold, traded or bartered.”

You can’t get more definitive than that.

DTC testing for genetic genealogy must be a self-regulating field, because the last thing we need is for the government to get involved, attempting to regulate something they don’t understand.  I truly believe government interference in the name of regulation would spell the end of genetic genealogy as we know it today.  DNA testing for genetic genealogy without sharing results is entirely pointless.

I’ve written about this topic in the past, but an update is warranted and I’ll be doing that sometime after the first of the year.  Mostly, I just need to be able to stay awake while slogging through the required reading (at some vendor sites) of page after page AFTER PAGE of legalese😊

Consumers really shouldn’t have to do that, and if they do, a short, concise summary should be presented to them BEFORE they purchase so that they can make a truly informed decision.

Stay tuned on this one.

2017 – The Year of Education

The fantastic news is that with all of the new people testing, a huge, HUGE need for education exists.  Even if 75% of the people who test don’t do anything with their results after that first peek, that still leaves a few million who are new to this field, want to engage and need some level of education.

In that vein, seminars are available through several groups and institutes, in person and online.  Almost all of the leadership in this industry is involved in some educational capacity.

In addition to agendas focused on genetic genealogy and utilizing DNA personally, almost every genealogy conference now includes a significant number of sessions on DNA methods and tools. I remember the days when we were lucky to be allowed one session on the agenda, and then generally not without begging!

When considering both DNA testing and education, one needs to think about the goal.  Not all customer goals are the same, and neither are the approaches necessary to answer their questions in a relevant way.

New testers to the field fall into three primary groups today, and their educational needs are really quite different, because their goals, tools and approaches needed to reach those goals are different too.

Adoptees and genealogists employ two vastly different approaches utilizing a common tool, DNA, but for almost opposite purposes.  Adoptees wish to utilize tests and trees to come forward in time to identify either currently living or recently living people while genealogists are interested in reaching backward in time to confirm or identify long dead ancestors. Those are really very different goals.

I’ve illustrated this in the graphic above.  The tester in question uses their blue first cousin match to identify their unknown parent through the blue match’s known lineage, moving forward in time to identify the tester’s parent.  In this case, the grandparent is known to the blue match, but not to the yellow tester. Identifying the grandparent through the blue match is the needed lynchpin clue to identify the unknown parent.

The yellow tester who already knows their maternal parent utilizes their peach second cousin match to verify or maybe identify their maternal great-grandmother who is already known to the peach match, moving backwards in time. Two different goals, same DNA test.

The three types of testers are:

  • Curious ethnicity testers who may not even realize that at least some of the vendors offer matching and other tools and services.
  • Genealogists who use close relatives to prove which sides of trees matches come from, and to triangulate matching segments to specific ancestors. In other words, working from the present back in time. The peach match and line above.
  • Adoptees and parent searches where testers hope to find a parent or siblings, but failing that, close relatives whose trees overlap with each other – pointing to a descendant as a candidate for a parent. These people work forward in time and aren’t interested in triangulation or proving ancestors and really don’t care about any of those types of tools, at least not until they identify their parent.  This is the blue match above.

These groups of testers want and need different things, so their priorities differ too, which shows in their recommendations and comments in online forums and in their input to vendors. You find Facebook groups dedicated to adoptees, for example, but you also find adoptees in more general genetic genealogy groups, where genealogists are sometimes surprised when people focused on parent searches downplay or dismiss tools such as Y DNA, mitochondrial DNA and chromosome browsers that form the bedrock foundation of what genealogists need and require.

Fortunately, there’s room for everyone in this emerging field.

The great news is that educational opportunities are abundant now. I’m listing a few of the educational opportunities for all three groups of testers, in addition to my blog of course.😊

Remember that this blog is fully searchable by keyword or phrase in the little search box in the upper right hand corner.  I see so many questions online that I’ve already answered!

Please feel free to share links of my blog postings with anyone who might benefit!

Note that these recommendations below overlap and people may well be interested in opportunities from each group – or all!!

Ethnicity

Adoptees or Parent Search

Genetic Genealogists

2018 – What’s Ahead? 

About midyear 2018, this blog will reach 1000 published articles. This is article number 939.  That’s amazing even to me!  When I created this blog in July of 2012, I wasn’t sure I’d have enough to write about.  That certainly has changed.

Beginning shortly, the tsunami of kits that were purchased during the holidays will begin producing matches, be it through DNA upgrades at Family Tree DNA, Big Y tests which were hot at year end, or new purchases through any of the vendors.  I can hardly wait, and I have my list of brick walls that need to fall.

Family Tree DNA will be providing additional STR markers extracted from the Big Y test. These won’t replace any of the 111 markers offered separately today, because the extraction through NGS testing is not as reliable as direct STR testing for those markers, but the Big Y will offer genealogists a few hundred more STRs to utilize. Yes, I said a few hundred. The exact number has not yet been finalized.

Family Tree DNA says they will also be introducing new “quality of life improvements” along with new privacy and consent settings.  Let’s hope this means new features and tools will be released too.

MyHeritage says that they are introducing new “Discoveries” pages and a chromosome browser in January.  They have also indicated that they are working on their matching issues.  The chromosome browser is particularly good news, but matching must work accurately or the chromosome browser will show erroneous information.  Let’s hope January brings all three features.

LivingDNA indicates that they will be introducing matching in 2018.

2018 – What Can You Do?

What can you do in 2018 to improve your odds of solving genealogy questions?

  • Test relatives
  • Transfer your results to as many data bases as possible (among the ones discussed above, after reading the terms and conditions, of course)
  • If you have transferred a version of your DNA that does not produce full results, such as the Ancestry V2 or 23andMe V4 test to Family Tree DNA, consider testing on the vendor’s own chip in order to obtain all matches, not just the closest matches available from an incompatible test transfer.
  • Test Y and mitochondrial DNA at Family Tree DNA.
  • Find ways to share the stories of your ancestors.  Stories are cousin bait.  My 52 Ancestors series is living proof.  People find the stories and often have additional facts, information or even photos. Some contacts qualify for DNA testing for Y or mtDNA lines. The GREAT NEWS is that Amy Johnson Crow is resuming the #52Ancestors project for 2018, providing hints and tips each week! Who knows what you might discover by sharing?! Here’s how to start a blog if you need some assistance.  It’s easy – really!
  • Focus on the brick walls that you want to crumble and then put together both a test and analysis plan. That plan could include such things as:

o   Find out if a male representing a Y line in your tree has tested, and if not, search through autosomal results to see if a male from that paternal surname line has tested and would be amenable to an upgrade.

o   Mitochondrial DNA test people who descend through all females from various female ancestors in order to determine their origins. Y and mtDNA tests are an important part of a complete genealogy story – meaning the reasonably exhaustive search!

o   Autosomal DNA test family members from various lines with the hope that matches will match you and them both.

o   Test family members in order to confirm a particular ancestor – preferably people who descend from another child of that ancestor.

o   Make sure your own DNA is in all 4 of the major vendors’ data bases, plus GedMatch. Look at it this way: everyone who is at GedMatch or at a third party (non-testing) site had to have tested at one of the major 4 vendors – so if you are in all of the vendors’ data bases, plus GedMatch, you’re covered.

Have a wonderful New Year and let’s make 2018 the year of newly discovered ancestors and solved mysteries!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

DNA.Land

DNA.Land, a free upload site, first launched in October of 2015. Its goal is to encourage sharing so that scientists can make new discoveries, including an initiative to understand what is needed for a cure for breast cancer by 2020.

Their purpose, as stated by DNA.Land in their FAQ:

DNA.Land is a place where you can learn more about your genome while enabling scientists to make new genetic discoveries for the benefit of humanity. Our goal is to help members to interpret their data and to enable their contribution to research.

DNA.Land has invested a lot of effort into providing tools for genetic genealogists in order to encourage them to upload their autosomal DNA testing results to DNA.Land and participate in research in exchange for having access to their tools.

Let’s step through the process and take a look at their offerings.

If you’re interested in participating, the first thing to do is to register and the next step is the consent process.

Consent

If you are considering participation, or uploading your DNA to utilize their ethnicity or matching services, you must sign their consent form. Needless to say, you need to fully read the consent form before clicking to authorize, at DNA.Land and anyplace else.

Upload Your File

After you click to approve and continue, you’ll be asked to select a file to upload. I chose Family Tree DNA Build 37.

Research Questions

Given that the focus of DNA.Land is medical research, you’ll be asked questions about yourself and your ancestry, such as your birthdate, as well as that of your parents.

I joined the Breast Cancer research and authorized researchers to contact me.

You are then asked, “Is this file your file?” DNA.Land wants to be absolutely sure you are providing information for your own file, and not someone else’s.

DNA.Land then asks questions related to your family and breast cancer. I answered the questions, agreed to be contacted if there are questions and joined the study.

You’ll answer questions about whether your parent, full siblings or children have been diagnosed with breast cancer, as well as questions about yourself.

I was excited to see that I was the 7,456th person to join the breast cancer initiative, but then I realized that their goal is 25,000 by the end of 2017. They have a LONG way to go. Please consider joining.

Your Personal Page

Your personal page includes your file status, the research projects in which you are participating as well as reports available.

Your file status is shown at the bottom of the page, including links to learn more.

About Imputation

DNA.Land was the first vendor to attempt imputation. I wrote about imputation in the article, Concepts – Imputation. I also wrote about matching with a vendor who utilizes imputation in the article Imputation Matching Comparison.

Imputation affects your matches, segment sizes and the quality of those matches. If you’re not familiar with imputation, I would strongly suggest reading these articles now.

While I’m incredibly supportive of the breast cancer and research initiatives, I’m less excited about the accuracy of imputation relative to genetic genealogy. Let’s take a look.

My Reports

Now that I’m done with setup and questions, I’m ready to view information about my own DNA results according to DNA.Land. Remember that these results include imputed information, meaning data that was imputed to be mine in regions not tested based on my DNA in regions that have been tested. My Family Tree DNA file that I uploaded held over 700,000 tested locations, and DNA.Land imputes another 38 million locations based on the 700,000 that were actually tested.
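To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses the rounded figures quoted above (700,000 tested and 38 million imputed), so the exact counts in any real file will differ somewhat.

```python
# Rounded figures from the text; the exact counts in a real file will differ.
tested_locations = 700_000
imputed_locations = 38_000_000

total_locations = tested_locations + imputed_locations
tested_share = tested_locations / total_locations

print(f"Total locations: {total_locations:,}")
print(f"Share actually tested: {tested_share:.1%}")
print(f"Share imputed: {1 - tested_share:.1%}")
```

In other words, on these rounded numbers, fewer than 2 of every 100 locations in the resulting file come from an actual test.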

You can select from various My Reports options:

  • Find Relatives
  • Find Relatives of Relatives
  • Ancestry Report
  • Trait Prediction Report

Let’s look at each one.

Find Relatives

As of today, just over 70,000 individuals have uploaded, an increase of 10,000 in just under two months, so the site is rapidly growing.

The first page is DNA Relationship Matches. The match below is my closest match to cousin, Karen. I wrote about dissecting this match in the article Imputation Matching Comparison.

You can show or hide the chromosome table at far right. Segments are divided into recent and ancient based on the segment size. I’m not sure I would have used the term “ancient,” but what DNA.Land is trying to convey is that more often, smaller segments are older than larger segments.

I have 11 High Certainty matches and 1 speculative.

The information page explains more. Click on the “Learn more about the report” link in the upper left hand corner, which displays the following example information.

All reported segments are 3.00 cM or larger.
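A minimal sketch of how a size-based filter like this could work. The 3.00 cM floor comes from the report itself, but the cutoff between “recent” and “ancient” below is a hypothetical value chosen purely for illustration, since DNA.Land’s actual cutoff isn’t stated here. The segment sizes are made up as well.

```python
MIN_REPORTED_CM = 3.0     # reporting floor stated on the DNA.Land information page
RECENT_CUTOFF_CM = 10.0   # hypothetical cutoff, NOT DNA.Land's documented value

def classify_segments(segment_lengths_cm):
    """Drop segments under the reporting floor and label the rest by size."""
    labeled = []
    for length in segment_lengths_cm:
        if length < MIN_REPORTED_CM:
            continue  # too small to be reported at all
        label = "recent" if length >= RECENT_CUTOFF_CM else "ancient"
        labeled.append((length, label))
    return labeled

example = [2.1, 3.0, 7.5, 12.4, 48.9]  # made-up segment sizes in cM
for length, label in classify_segments(example):
    print(f"{length:5.1f} cM  {label}")
```

Keep in mind that a size label like this is only a rule of thumb. Smaller segments are more often older, but not always.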

Very beneficially, my closest match, Karen, showed her GedMatch kit number as her middle name. I utilized her file at GedMatch and her results at DNA.Land to compare raw data file matching and imputed file matching. You can read about the findings in the article, Imputation Matching Comparison.

Based on imputed matching, I’m not sure that today I would have much confidence in matches to the relatives of relatives, but let’s take a look anyway.

Find Relatives of Relatives

Relatives of Relatives is a bit confusing.  Think of it as an alternative to a chromosome browser.  Here’s what their information page says about this feature.

This is a bit confusing. The “via” relative is the person on your match report.

The first person listed, or the “endpoint” relative is the person related to them.

The intersection is the set of intersecting matching segments between you, your match and their match that (apparently) also matches you, or they would not be on this report.
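The idea of an “intersection” can be sketched as finding where segments overlap on the same chromosome. This is only my illustration of the concept, not DNA.Land’s actual method, and the positions below are invented. Note also that overlapping positions alone don’t prove that three people share the same ancestral DNA.

```python
def intersect_segments(segments_a, segments_b):
    """Return the overlapping parts of two lists of (chromosome, start, end) segments."""
    overlaps = []
    for chrom_a, start_a, end_a in segments_a:
        for chrom_b, start_b, end_b in segments_b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # the two segments genuinely overlap
                overlaps.append((chrom_a, start, end))
    return overlaps

# Made-up positions, for illustration only.
me_vs_via = [(1, 1_000_000, 9_000_000), (5, 20_000_000, 31_000_000)]
via_vs_endpoint = [(1, 6_000_000, 12_000_000), (7, 1_000_000, 4_000_000)]
me_vs_endpoint = [(1, 5_000_000, 8_000_000)]

# Overlap between all three comparisons.
shared = intersect_segments(
    intersect_segments(me_vs_via, via_vs_endpoint), me_vs_endpoint
)
print(shared)
```

Here the only region common to all three comparisons is on chromosome 1, between positions 6,000,000 and 8,000,000.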

Here’s a Relatives of Relatives match with my strongest match, Karen.

The problem is that the person shown as Karen’s match, Shelley, is not shown as my match.  The common matching segments between the three of us, shown above and below, are very small.  Even though Shelley is a match to Karen, Shelley apparently only matches me on smaller segments, not large enough to pass the DNA.Land threshold for a match.

The problem is that all of the matching and triangulating segments above are imputed segments and don’t show up as legitimate matches at GedMatch between me and Karen, so they can’t be a valid three way match between me, Karen and Shelley.

In other words, these aren’t valid matches at all, even before the discussion about whether they are identical by descent, chance or population.  Therefore, these have to be matches on imputed regions, not through actual testing.

The certainty field is also confusing.  I initially thought that the “high” certainty pertained to the three way match certainty, but it doesn’t.  Certainty means the certainty of the match between your match (the via relative) and the endpoint (their match) and has nothing to do with the certainty of the segments matching the three of you being relevant.

If you’d like to utilize this information, please read the information pages VERY CAREFULLY and be sure you understand what the information is, and isn’t, telling you.

Ancestry Report (Ethnicity)

The Ancestry report is DNA.Land’s ethnicity report.

Looking at the map, it’s difficult to compare the DNA.Land results to other vendors, because they have Scandinavia divided in half, with the westernmost part of Scandinavia included in their Northwest Europe orange grouping, the light green designated as Finnish and the olive green as North Slavic. Other vendors include Norway and all of Sweden as part of Scandinavia.

One nice thing is that the population reference locations are shown on the map below, even for non-matching reference groups.

In my case, DNA.Land missed my Native American entirely.

The chart below represents my known and proven genealogy as compared to the DNA.Land ethnicity results.

You can see how DNA.Land stacks up against the rest of the vendors, below.

Trait Prediction Report

The trait report requires an additional consent form. In essence, DNA.Land wants to make sure you really want to see your traits, that you understand what you are going to see and that you understand how traits are calculated and displayed.

DNA.Land offers several traits you can select from.

But there’s a hitch.

Before you can see your traits, you get to answer a survey. In all fairness, DNA.Land’s purpose is medical research, and the reports participants receive are free.

My eye color is accurate, BUT, I also just told them that my eye color is dark brown during the questions. Not terribly confidence inspiring – but my confidence increased  after reviewing all of the information they provided about the science behind my actual trait prediction.

The eye color map, above, is something unique I haven’t seen elsewhere. I find this kind of information quite interesting.

Even though I did provide DNA.Land with the “brown eyes” answer, this chart makes me feel much better, because they shared the science behind my result with me. Based on that science, it’s apparent that they didn’t just parrot my answer back to me.

There is also a “what if my result is wrong” link. After all, science is all about continuing to learn and to think we know everything there is to know about genetics is foolhardy.

Yea, I like this a LOT!

If you’d like to read more about how genetic research takes place, read the interesting article titled Is there a Firefox Gene? Yes, that’s the Firefox browser, and yes, this is a real study. Take a look. It’s really quite interesting and written in plain English.

Summary

DNA.Land has a different purpose than other DNA matching and ethnicity sites. As a nonprofit, DNA.Land offers their matching and ethnicity services as an enticement to genetic genealogists who have paid to test elsewhere to upload their results to DNA.Land and in doing so, to participate in medical research.

DNA.Land is absolutely up front about their mission. The features are “complimentary,” so to speak, meant to be enticements to consumers to participate and contribute their DNA results.

Given that, it’s difficult to be terribly upset with DNA.Land’s features and services.

DNA.Land has a nice user interface and some nice display features. Their eye color mapping isn’t found elsewhere, and other similar features would make great teaching tools. Their help pages are informative and educational.

Imputation concerns me. Imputation for medical research doesn’t directly affect me today, although it may someday, given that imputed data is used for research.

Imputed data does affect your results at Promethease if you choose to utilize your imputed results as input for any application that reports your academic and/or medical mutations. You can read about that in the article, Imputation Analysis Using Promethease.

Imputation affects matching for genetic genealogy negatively. While I didn’t discuss matching quality in this article, I did in the article Imputation Matching Comparison, which I would encourage you to read if you are attempting to utilize the DNA.Land matching function seriously for genealogy. I would encourage genetic genealogists to simply match at the vendor where they tested, or at Family Tree DNA which accepts uploads (Ancestry V1, V2 and 23andMe V3, V4) from other vendors, or at GedMatch for serious match analysis.

My suggestion to DNA.Land for matching would be to eliminate the smaller segments entirely, especially if they are a result of imputation and not actual matching DNA segments. In my limited experiment, DNA.Land seemed to do relatively well on matching and utilizing larger segments.

Ethnicity results at DNA.Land, called Ancestry Results, are divided oddly, with Northwestern Europe including all of the British Isles, western Scandinavia along with the northwest quadrant of continental Europe. This division makes it extremely difficult to compare to other vendors’ results.

DNA.Land seems to report an unrealistic amount of Southern European, but again, it’s somewhat difficult to tell where the dividing line occurs. It would be easier if their ethnicity map were overlaid on a current map of Europe showing country boundaries. DNA.Land missed my Native entirely.

It would be interesting to know how much of the ethnicity results are calculated on actual DNA and how much through imputation. Ethnicity results tend to be dicey enough in the industry as a whole without adding the uncertainty of imputation on top. Having said that, given how popular ethnicity testing has become, offering another ethnicity opinion is probably a large draw for attracting people to upload and participate in research at DNA.Land.

Some of the trait information is quite interesting and new traits will probably be equally so, although I wonder how much of that information is imputed as well. In other words, I don’t know if the results are actually “mine” through testing or could be in error. The good news is that DNA.Land provides the genetic locations where the trait analysis is compiled, allowing you to utilize a service like Promethease which provides the ability in some cases to confirm imputed data if you upload your actual tested files from testing vendors.

For all results, I would very much like to see a toggle that lets you switch between actual match results and match results derived from imputation.

I would also like to see some research about the accuracy of imputation as compared to non-imputed results. Clearly this would be available through research efforts like my own at Promethease, exome and full genome sequencing.

In a nutshell, DNA.Land provides an interesting free service so long as you don’t want to take the results terribly seriously for genealogy research. If any of the results are important or you want to depend upon them for accuracy, verify elsewhere with actual tested data.

It’s important to remember at DNA.Land that their real goal isn’t to provide a product or to compete with the testing vendors. Their features are a “thank you” or enticement for consumers to contribute their autosomal data for medical research, some of which may be “for profit.”  Companies aren’t going to participate in research initiatives that don’t hold the potential for profit.

I really didn’t need an enticement, but I’m grateful nonetheless.

Additionally, DNA.Land has provided an important first foray into imputation and allowed us to compare imputed data with tested data. I know that wasn’t their goal, but I’m glad to have the opportunity to learn and work with real life examples. My own. I would encourage you to do the same.

Be Part of the Cure

The last thing I have to say is that I truly hope and pray that the Breast Cancer Deadline shown as 2020 is a real and achievable goal.

I welcome the opportunity for anything I can do to help eliminate that horrific scourge that has affected so many women. Breast cancer has taken the lives of my family members and friends, as I’m sure it has yours, and I would like nothing better than to participate in some small way in wiping it off the face of the earth. DNA.Land is one way you can help, and it costs you absolutely nothing.

Imputation Analysis Utilizing Promethease

We know in the genetics industry that imputation is either coming or already here for genetic genealogy. I recently wrote two articles, here and here, explaining imputation and its (apparent) effects on matching – or at least the differences between vendors who do and don’t utilize imputation on the segments that are set forth as matches.

I will be writing shortly about my experience utilizing DNA.Land, a vendor who encourages testers to upload their files to be shared with medical researchers. In return, DNA.Land provides matching information and ethnicity – but they do impute results that you don’t have based on “typical” DNA that is generally inherited with the DNA you do have.

Aside from my own curiosity and interest in health, I have been attempting to determine the relative accuracy of imputation.

Promethease is a third party site that provides consumers who upload their autosomal DNA files with published information about their SNPs, mutations, either bad, good or neither, meaning just information. This makes Promethease the perfect avenue for comparing the accuracy of the imputed data provided by DNA.Land compared against the data provided by Promethease generated from files from vendors who do not impute.

Even better, I can directly compare the autosomal file from Family Tree DNA that I uploaded to DNA.Land with my resulting DNA.Land file after DNA.Land imputed another 38 million locations. I can also compare the DNA.Land results to an extensive exome test that provided results for some 50 million locations.

Uploading the files from each testing vendor separately to Promethease allows me to see which of the mutations imputed by DNA.Land are accurate, meaning whether an imputed mutation agrees with the actual test result when the same location was tested by any vendor.

In addition to the typical genetic genealogy vendors, I’ve also had my DNA exome sequenced, which includes the 50 million locations in humans most likely to mutate.  This means those locations should be the locations most likely to be imputed by DNA.Land.

Finally, at Promethease, I can combine my results from all the vendors where I actually tested to provide the greatest coverage of actually tested locations, and then compare to DNA.Land – providing the most comprehensive comparison.

I will utilize the testing vendors’ actual results to check the DNA.Land imputed results.
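The comparison idea can be sketched as follows. This is only an illustration of checking imputed calls against tested calls at locations found in both files. It is not how Promethease works internally, and the rsIDs and genotypes below are invented.

```python
def concordance(tested, imputed):
    """Compare genotype calls at locations present in both files."""
    shared = tested.keys() & imputed.keys()
    # Sort each genotype so that "AG" and "GA" count as the same call.
    agree = sum(
        1 for rsid in shared if sorted(tested[rsid]) == sorted(imputed[rsid])
    )
    return agree, len(shared)

# Invented example data: rsID -> genotype.
tested_calls = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT"}
imputed_calls = {"rs0001": "GA", "rs0002": "CT", "rs0004": "GG"}

agree, compared = concordance(tested_calls, imputed_calls)
print(f"{agree} of {compared} shared locations agree")
```

Locations that appear in only one file (rs0003 and rs0004 here) can’t be checked at all, which is why broader coverage of actually tested locations makes for a better comparison.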

Let’s see what the results produce.

The Test Process

The method I used for this comparison was to upload my Family Tree DNA autosomal raw data file to DNA.Land. DNA.Land then took the 700,000+ locations that I did test for at Family Tree DNA, and imputed more than 38 million additional locations, raising my tested and imputed number of locations to about 39 million.

Then, I downloaded my huge DNA.Land file and uploaded it to Promethease, following the Promethease instructions.

In order to do a comparison against the imputed data that DNA.Land provided, I uploaded files from the following vendors individually, one at a time, to Promethease to see which versions of the files provided which results – meaning which mutations the files produced by actual testing at vendors could confirm in the DNA.Land imputed results.

  • DNA.Land (imputed)
  • Genos – Exome testing of 50 million medically relevant locations
  • Ancestry V1 test
  • Ancestry V2 test
  • Family Tree DNA
  • 23andMe V3 test
  • 23andMe V4 test
  • Combined file of all non-imputed vendor files

Promethease provides a wonderful feature that enables users to combine multiple vendors’ files into one run. As a final test, I combined all of my non-imputed files into one run in order to compare all of my non-imputed results, together, with DNA.Land’s imputed results.
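How Promethease merges files internally is not described here, so the following Python sketch only illustrates the general idea of a combined run. It assumes each vendor file has already been reduced to a mapping of location identifier to genotype. The identifiers and genotypes are invented, and `merge_vendor_calls` is a hypothetical helper, not a Promethease function.

```python
from collections import defaultdict


def merge_vendor_calls(vendor_calls):
    """Union the locations from several vendor files.

    vendor_calls maps a vendor name to {snp_id: genotype}.
    Returns (merged, conflicts). merged holds one genotype per location
    where every vendor that tested it agrees. conflicts holds the
    locations where the vendors disagree, with each vendor's call.
    """
    seen = defaultdict(dict)
    for vendor, calls in vendor_calls.items():
        for snp_id, genotype in calls.items():
            # Sort the alleles so "AG" and "GA" count as the same call.
            seen[snp_id][vendor] = "".join(sorted(genotype))

    merged, conflicts = {}, {}
    for snp_id, by_vendor in seen.items():
        distinct = set(by_vendor.values())
        if len(distinct) == 1:
            merged[snp_id] = distinct.pop()
        else:
            conflicts[snp_id] = by_vendor
    return merged, conflicts


# Invented example data, not actual test results.
example = {
    "FTDNA": {"rs0001": "AG", "rs0002": "CC"},
    "Ancestry V2": {"rs0002": "CC", "rs0003": "TT"},
    "23andMe V4": {"rs0001": "GA", "rs0003": "CT"},
}
merged, conflicts = merge_vendor_calls(example)
print(len(merged), "agreed locations;", len(conflicts), "in conflict")
```

The point of the sketch is that a combined run covers the union of everything any vendor tested, which is why it can report more locations than any single file, and that locations tested by more than one vendor can also disagree with each other.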

Promethease provides results that fall into 3 categories:

  • Bad – red
  • Good – green
  • Grey – “not set” – neither bad nor good, just information

Promethease does not provide diagnoses of any form, just information from the published literature about various mutations and genetic markers and what has been found in research, with links to the sources through SNPedia.

Results

I compiled the following chart with the results of each individual file, plus a combined file made up of all of the non-imputed files.

The results are quite interesting.

The combined run that included all of the vendors’ files except for DNA.Land provided more “bad” results than the imputed DNA.Land file.

I expected that the Genos exome test would have covered all of the locations tested by the three genetic genealogy vendors, but clearly not, given that the combined run provides more results than the Genos exome run by itself. In fact, the combined run reported 80,607 total locations, while the Genos run alone reported only 45,595.

DNA.Land only imputed 34,743 locations that returned results.

Comparison for Accuracy

Now, the question is whether the DNA.Land imputed results are accurate.

Due to the sheer number of results, I focused only on the “bad” results, the ones that would be most concerning, to get an idea of how many of the DNA.Land results were tested in the original uploaded file (from FTDNA) and how many were imputed. Of the imputed locations, I determined how many are accurate by comparing the DNA.Land results to the combined testing results. My hope, of course, is that most of the locations found in the DNA.Land imputed file are also to be found in one of the files tested at the vendors, and therefore covered in the combined file run.

I combined my results from 3 runs (FTDNA, DNA.Land and the combined file) into a common spreadsheet, color coding each run differently, with two goals:

  • First, I wanted to see the locations reported as “bad” that were actually tested at FTDNA. By comparing the FTDNA locations with the DNA.Land imputed file, we know that DNA.Land was NOT imputing those locations, and conversely, that they WERE imputing the rest of the locations.
  • Second, I wanted to know if locations imputed by DNA.Land and reported as “bad” had been tested by any testing company, and if DNA.Land’s imputation was accurate as compared to an actual test.

You can read more about how Promethease reports results, here.

I’m showing two results in the spreadsheet example, below.

White row = FTDNA test result
Yellow row = DNA.Land result
Blue row = combined test result

These two examples show two mutations that are ranked as “bad” for the same condition. This result really only tells me that I metabolize some things slower than other people. Reading the fine print tells me this as well:

The proportion of slow and rapid metabolizers is known to differ between different ethnic populations. In general, the slow metabolizer phenotype is most prevalent (>80%) in Northern Africans and Scandinavians, and lowest (5%) in Canadian Eskimos and Japanese. Intermediate frequencies are seen in Chinese populations (around 20% slow metabolizers), whereas 40 – 60% of African-Americans and most non-Scandinavian Caucasians are slow metabolizers.[PMID 16416399]

Many of you are probably slow metabolizers too.

I used this example to illustrate that not everything that is “bad” is going to keep you awake at night.

The first mutation, gs140, is found in the DNA.Land file, but there is no corresponding white row, representing the original Family Tree DNA report, meaning that DNA.Land imputed the result. However, gs140 is tested by some vendor in the combined file. The results do match (verified by actually comparing the results individually) and therefore, the DNA.Land imputation was accurate, as noted in the DNA.Land Analysis column at far right.

In the second example, gs154 is reported by DNA.Land, but since it’s also reported by Family Tree DNA in the white row, we know that this value was NOT imputed by DNA.Land, because this was part of the originally uploaded file. Therefore, in the Analysis column, I labeled this result as “tested at FTDNA.”

Analysis

I analyzed each of the rows of “bad” results found in the DNA.Land file by comparing them first to the FTDNA file and then the Combined file. In some cases, I needed to return to the various vendor results to see which vendor had done the testing on a specific location in order to verify the result from the individual run.
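The row-by-row comparison was done by hand in a spreadsheet, but the decision rules can be sketched in Python. This is only an illustration under stated assumptions: each file is reduced to a mapping of location to genotype, genotypes are already written the same way in every file, and the location names and values below are invented. `classify` is a hypothetical helper.

```python
def classify(snp_id, dnaland, ftdna, other_vendors):
    """Label one DNA.Land result using the categories in this article.

    dnaland and ftdna map snp_id -> genotype. other_vendors maps each
    remaining vendor name to its own snp_id -> genotype mapping.
    """
    if snp_id in ftdna:
        # Present in the uploaded file, so DNA.Land did not impute it.
        return "tested at FTDNA"
    tested = {calls[snp_id] for calls in other_vendors.values() if snp_id in calls}
    if not tested:
        return "imputed, but not tested elsewhere"
    if len(tested) > 1:
        # The vendors that actually tested this location disagree.
        return "conflict"
    if dnaland[snp_id] in tested:
        return "imputed correctly"
    return "imputed incorrectly"


# Invented example data, not actual test results.
dnaland = {"loc1": "AG", "loc2": "CC", "loc3": "TT", "loc4": "GG", "loc5": "AA"}
ftdna = {"loc1": "AG"}
other_vendors = {
    "Ancestry V2": {"loc2": "CC", "loc3": "CT", "loc4": "GG"},
    "23andMe V4": {"loc2": "CC", "loc4": "AG"},
}
labels = {snp: classify(snp, dnaland, ftdna, other_vendors) for snp in dnaland}
for snp, label in labels.items():
    print(snp, "->", label)
```

A location lands in "conflict" whenever the tested vendors disagree among themselves, regardless of what DNA.Land reported, because there is then no single tested value to check the imputation against.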

So, how did DNA.Land do with imputing data as compared with actual tested results?

Category | # Results | % | Comment
Tested, not Imputed | 171 | 38.6 | This “bad” location was tested at FTDNA and uploaded, so we know it was reported accurately at DNA.Land and not imputed.
Total Imputed* | 272 | 61.4 | Meaning total of “bad” results not tested at FTDNA, so not uploaded to DNA.Land, therefore imputed.
Imputed Correctly | 259 | 95.22 | This result was verified to match a tested location in the combined run.
Imputed, but not tested elsewhere | 6 | 2.21 | Accuracy cannot be confirmed.
Conflict | 3 | 1.10 | DNA.Land results cannot be verified due to an error of some sort – two of these three are probably accurate.
Imputed Incorrectly | 4 | 1.47 | Confirmed by the combined run where the location was actually tested at multiple vendors.
Not reported, and should have been | 1 | 0.37 | 4 other vendor tests showed this mutation, including FTDNA which was uploaded to DNA.Land. Therefore this location should have been reported by the DNA.Land file.

*The total number of “bad” results was 443, 171 that were tested and 272 that were imputed. Note that the percentages of imputations shown below the “Total Imputed” number of 272 are calculated based on the number of locations imputed, not on the total number of locations reported.
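The percentages above can be re-checked with a few lines of Python. The counts are the ones reported in this article. The variable names and the `pct` helper are just for illustration.

```python
tested = 171  # "bad" results tested at FTDNA, so not imputed
imputed_breakdown = {
    "Imputed correctly": 259,
    "Imputed, but not tested elsewhere": 6,
    "Conflict": 3,
    "Imputed incorrectly": 4,
}
imputed = sum(imputed_breakdown.values())  # total imputed "bad" results
total_bad = tested + imputed               # all "bad" results reported


def pct(part, whole):
    """Percentage of whole, rounded to two decimal places."""
    return round(100 * part / whole, 2)


# The first two rows are shares of all "bad" results.
print("Tested, not imputed:", pct(tested, total_bad))
print("Total imputed:", pct(imputed, total_bad))

# The remaining rows are shares of the imputed results only.
for label, count in imputed_breakdown.items():
    print(label + ":", pct(count, imputed))
print("Not reported, and should have been:", pct(1, imputed))
```

Note that the four imputed categories add up to 272 on their own. The one location that was not reported is counted separately and expressed against the same base of 272.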

Concerns, Conflicts and Errors

It’s worth noting that my highest imputed “bad” risk from DNA.Land was not tested elsewhere, so cannot be verified, which concerns me.

On the three results where a conflict exists, all 3 locations were tested at multiple other vendors, and those vendors’ tested results differ from each other, which means that the DNA.Land result cannot be verified as accurate. Clearly, an error exists in at least one of the other tests.

In one conflict case, this error has occurred at 23andMe on either their V3 or V4 chip, where the results do not match each other.

In a second conflict case, two of the other vendors agree and the DNA.Land imputation is likely accurate, as it matches two of the three other vendor tests.

In the third conflict case, the Ancestry V2 test confirms one of the 23andMe results, which matches the DNA.Land results, so the DNA.Land result is likely accurate.

Of the 4 results that were confirmed to be imputed incorrectly, all locations were tested at multiple vendors. In two cases, the location was confirmed on two other tests and in the other two cases, the location was tested at three vendors. The testing vendors’ results all matched each other.

Summary

Overall, given the problems found with both DNA.Land and MyHeritage, who both impute, relative to genetic genealogy matching, I was surprised to find that the DNA.Land imputed health results were relatively accurate.

I expected the locations reported in the FTDNA file to be reported accurately by DNA.Land, because that data was provided to them. In one case, it was not.

Of the 272 “bad” results imputed, 259, or 95.22% could be verified as accurate.

Six could not be verified, and three were in conflict, but of those, it’s likely that two of the three were imputed accurately by DNA.Land. The third can’t be verified. This totals 3.31% of the imputed results that are ambiguous.

Only 1.47% were imputed incorrectly. If you add the 0.37% for the location that was not reported and should have been, and make the leap of assumption that one of the three in conflict is in error, DNA.Land still has an error rate of just over 2%.
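For readers who want to follow the arithmetic, here is a short Python sketch of the ambiguous share and the "just over 2%" figure. The counts come from this article. Treating exactly one conflict as a DNA.Land error is the assumption stated above, not a verified fact.

```python
imputed = 272               # total imputed "bad" results
unverifiable = 6            # imputed, but not tested elsewhere
conflicts = 3               # tested vendors disagree with each other
incorrect = 4               # confirmed wrong against tested results
not_reported = 1            # tested at FTDNA but missing from DNA.Land
conflict_assumed_wrong = 1  # assumption: one of the three conflicts is an error

# Share of imputed results that cannot be confirmed either way.
ambiguous_pct = round(100 * (unverifiable + conflicts) / imputed, 2)

# Error rate when the missing location and one conflict are counted as errors.
error_count = incorrect + not_reported + conflict_assumed_wrong
error_pct = round(100 * error_count / imputed, 2)

print("Ambiguous:", ambiguous_pct, "%")
print("Error rate under the stated assumption:", error_pct, "%")
```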

I can see why Illumina would represent to the vendors that imputation technology is “very accurate.” “Very” of course is relative, pardon the pun, in genetic genealogy, to how well matching occurs, not only when the new GSA chip is compared to another GSA chip, but when the new GSA version is compared to the older OmniExpress version. For backward compatibility between the chip versions, imputation must be utilized. Thanks a lot, Illumina (said in my teenage sarcastic voice).

Since DNA.Land accepts files from all the vendors on all chips, for DNA.Land to be able to compare all locations in all vendors’ files against each other, the “missing” data in each file must be imputed. MyHeritage is doing something similar (having hired one of the DNA.Land developers), and both vendors have problems with genetic genealogy matching.

This raises the question of why the matching is demonstrably so poor for genetic genealogy. I’ve written about this phenomenon here, Kitty Cooper wrote about it here and Leah Larkin here.

Based on this comparison, each individual DNA.Land imputed file would contain about a 2% error rate of incorrectly imputed data, assuming the error rate is the same across the entire file, so a combined total of 4% for two individuals, if you’re just looking at individual SNPs. Perhaps entire segments are being imputed incorrectly, given that we know that DNA is inherited in segments. If that is the case, and these individual SNPs are simply small parts of entire segments that are imputed incorrectly, they might account for an equal number of false positive matches. In other words, if 10 segments are imputed incorrectly for me, that’s 10 segments reporting false positive matches I’ll have when paired against anyone who receives the same imputed data. However, that doesn’t explain the matches that are legitimate (on tested segments) and aren’t found by the imputing vendors, and it doesn’t explain an erroneous match rate that appears to be significantly higher than the 2-4% found in this comparison.
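The combined figure for two individuals can be sketched in Python. The 4% above is the simple sum of two 2% rates. If the errors in the two files are assumed to be independent of each other, which is an assumption for illustration and may well not hold if both people receive the same incorrectly imputed segment, the chance that a given imputed SNP is wrong in at least one of the two files comes out slightly lower. `pair_error_rate` is a hypothetical helper.

```python
def pair_error_rate(p_a, p_b):
    """Chance that a SNP is wrong in at least one of two files.

    Assumes the two files' errors are independent of each other.
    """
    return 1 - (1 - p_a) * (1 - p_b)


per_file = 0.02  # roughly the per-file error rate found in this comparison

simple_sum_pct = round(100 * (per_file + per_file), 2)
independent_pct = round(100 * pair_error_rate(per_file, per_file), 2)

print("Simple sum:", simple_sum_pct, "%")
print("Assuming independent errors:", independent_pct, "%")
```

Either way, the result stays in the 2-4% range, far below the erroneous match rate that matching at the imputing vendors appears to show.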

I’ll be writing about the DNA.Land matching comparison experience shortly.

I would strongly prefer that medical research be performed on fully tested individuals. I realize that encouraging consumers to upload their data, and then imputing additional information, is much less expensive than actual testing. However, accuracy is an issue, and a 2% error rate, if someone is dealing with life-saving and life-threatening research, could be a huge margin of error from the beginning of the project, based on faulty imputation – which could be eliminated by simply testing people. This seems like an unnecessary risk and faulty research just waiting to happen. This error rate is on top of the actual sequencing error rate, but sequencing errors will be found in different locations in individuals, not on the same imputed segment assigned to multiple people in population groups. Imputation errors could be cumulative in one location, appearing as a hot spot when in reality, it’s an imputation error.

As related to genetic genealogy, I don’t think imputation and genetic genealogy are good bedfellows. DNA.Land’s matching was even worse when it was initially introduced, which is one reason I’ve waited so long to upload and write about the service.

Unfortunately, with Illumina obsoleting the OmniExpress chip, we’re not going to have a choice, sooner rather than later. All vendors who utilized the OmniExpress chip are being forced off, either onto the GSA chip or to an Exome or full sequence chip. The cost of sequencing for anything other than the GSA chip is simply more than the genetic genealogy market will stand, not to mention even larger compatibility issues. My Genos Exome test cost $499 just a few months ago and still sells for that price today.

The good news is that utilizing imputation, we will still receive matches, just less accurate matches when comparing the new chip to older versions, and when using imputation.

New testers will never know the difference. Testers not paying close attention won’t notice or won’t realize either. That leaves the rest of us “old timers” who want increased accuracy and specificity, not less, flapping in the wind along with the vendors who don’t sell our test results into the medical arena and have no reason to move to the new GSA platform other than Illumina obsoleting the OmniExpress chip.

Like I said, thanks Illumina.
