MyHeritage LIVE Conference Day 2 – The Science Behind DNA Matching    

The MyHeritage LIVE Oslo conference is but a fond memory now, and I would count it as a resounding success.

Perhaps one of the reasons I enjoyed it so much is the scientific aspect, and because the content is tightly focused on a topic I enjoy without the size and complexity of RootsTech. The smaller, more intimate venue also provides access to the “right” people as well as the ability to meet other attendees and not be overwhelmed by the sheer size.

Here are some stats:

  • 401 registered guests
  • 28 countries represented including distant places like Australia and South America
  • More than 20 speakers plus the hands-on workshops where specialist teams worked with students
  • 38 sessions and workshops, plus the party
  • 60,000 livestream participants, in spite of the time differences around the world

I was blown away by the number of livestream attendees.

I don’t know what criteria Gilad Japhet will be using to determine “success” but I can’t imagine this conference being judged as anything but.

Let’s take a look at the second day. I spent part of the time talking to people and drifting in and out of the rear of several sessions for a few minutes. I meant to visit some of the workshops, but there was just too much good, distracting content elsewhere.

I began Sunday in Mike Mansfield’s presentation about SuperSearch. Yes, I really did attend a few sessions not about DNA, but my favorite was the session on Improved DNA Matching.

Improved DNA Matching

I’m sure it won’t surprise any of my readers that my favorite presentations were about the actual science of genetic genealogy.

Consumers don’t really need to understand the science behind autosomal results to reap the benefits, but the underlying science is part of what I love – and it’s important for me to understand the underpinnings to be able to unravel the fine points of what the resulting matches are and are not revealing. Misinterpretation of DNA results leading to faulty conclusions is a real issue in genetic genealogy today. Consequently, I feel that anyone working with other people’s results and providing advice really needs to understand how the science and technology work together.

Dr. Daphna Weissglas-Volkov, a population geneticist by training, although she clearly functions far beyond that scope today, gave a very interesting presentation about how MyHeritage handles (their greatly improved) DNA Matching. I’m hitting the high points here, but I would strongly encourage you to watch the video of this session when it is made available online.

In addition to Dr. Weissglas-Volkov’s slides, I’ve added some additional explanations and examples in various places. You can easily tell that the slides are hers and the graphics that aren’t MyHeritage slides are mine.

Dr. Weissglas-Volkov began the session by introducing the MyHeritage science team and then explaining terminology to set the stage.

A match is when two people match each other on a fairly long piece of DNA. Of course, “fairly long” is defined differently by each vendor.

Your genetic map (of your chromosomes) is composed of the DNA you inherit from different ancestors by the process of recombination, which occurs when DNA is transferred from the parents to the child. A centiMorgan (cM) is a measure of the relative likelihood that a recombination will occur in a single generation; two locations 1 cM apart have roughly a 1% chance of being separated by a recombination in one generation. On average, 36 recombinations occur in each generation, meaning that the DNA on any given chromosome is usually divided at least once. However, women, for reasons unknown, have about 1.5 times as many recombinations as men.

You can’t see that when looking at an example of a person compared to their parents, of course, because each individual is a full match to each parent, but you can see this visually when comparing a grandchild to their maternal grandmother and their paternal grandmother on a chromosome browser.

The above illustration is the same female grandchild compared to her maternal grandmother, at left, and her paternal grandmother at right. Therefore the number of crossovers at left is through a female child (her mother), and the number at right is through a male child (her father.)

Number of crossovers:

  • Through female child (left): 57
  • Through male child (right): 22

There are more segments at left, through the mother, and the segments are generally shorter, because they have been divided into more pieces.

At right, fewer and larger segments through the father.

Keep in mind that because you have a strand of DNA from each parent, with exactly the same “street addresses,” what is produced by DNA sequencing is two columns of data – but your Mom’s and Dad’s DNA is intermixed.

The information in the two columns can’t be identified as Mom’s or Dad’s DNA or strand at this point.

That interspersed raw data is called a genotype. A haplotype is when Mom’s and Dad’s DNA can be reassembled into “sides” so you can attribute the two letters at each address to either Mom or Dad.

Here’s a quick example.

The goal, of course, is to figure out how to reassemble your DNA into Mom’s side and Dad’s side so that we know that someone matching you is actually matching on all As (Mom) or all Gs (Dad) in this example, and not a false match that zigzags back and forth between Mom and Dad.

The best way to accomplish that goal of course is trio phasing, when the child and both parents are available, so by comparing the child’s DNA with the parents you can assign the two strands of the child’s DNA.
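To make trio phasing a bit more concrete, here is a minimal Python sketch. The six-address genotypes and the function names (phase_site, phase_trio) are made up for illustration; this is a toy under those assumptions, not MyHeritage's actual procedure. At each address it asks which of the child's two letters could have come from Mom and which from Dad, and marks the address with "?" when both assignments are possible.

```python
def phase_site(child, mom, dad):
    """Return (letter_from_mom, letter_from_dad) for one address, or None if unresolved."""
    a, b = child
    options = set()
    # Try both ways of handing the child's two letters to the two parents.
    if a in mom and b in dad:
        options.add((a, b))
    if b in mom and a in dad:
        options.add((b, a))
    # Exactly one consistent assignment means the address is phased.
    return options.pop() if len(options) == 1 else None


def phase_trio(child, mom, dad):
    """Phase every address of the child's genotype against both parents."""
    return [phase_site(c, m, d) for c, m, d in zip(child, mom, dad)]


# Hypothetical unphased genotypes: two unordered letters per address.
child = ["AG", "GA", "AG", "AG", "GA", "AG"]
mom = ["AA", "AA", "AC", "AA", "AT", "AG"]
dad = ["GG", "GC", "GG", "GT", "GG", "AG"]

phased = phase_trio(child, mom, dad)
mom_side = "".join(p[0] if p else "?" for p in phased)
dad_side = "".join(p[1] if p else "?" for p in phased)
print("Mom's side:", mom_side)
print("Dad's side:", dad_side)
```

The last address stays unresolved because the child and both parents all carry A and G there, which is exactly the kind of spot where parents alone can't settle the question.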

Unfortunately, few people have both or even one parent available in order to actually divide their DNA into “sides,” so the next best avenue is statistical phasing. I’ve called this academic phasing in the past, as compared to parental phasing which MyHeritage refers to as trio phasing.

There’s a huge amount of confusion about phasing, with few people understanding there are two distinct types.

Statistical phasing is a type of machine learning where a large number of reference populations are studied. Since we know that DNA travels together in blocks when inherited, statistical phasing learns which DNA travels with which buddy DNA – and creates probabilities. Your DNA is then compared to these models and your DNA is reshuffled in order to assemble your DNA into two groups – one representing your Mom’s DNA and one representing your Dad’s DNA, according to statistical probability.

Looking at your genotype, if we know that As group together at those 6 addresses in my example 95% of the time, then we know that the most likely scenario to create a haplotype is that all of the As came from one parent and all of the Gs from the other parent – although without additional information, there is no way to yet assign the maternal and paternal identifier. At this point, we only know parent 1 and parent 2.
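The idea behind statistical phasing can be sketched in a few lines of Python. Everything here is an assumption for illustration: the tiny reference panel, its frequencies, the "rare" fallback value and the function names are all hypothetical, and real phasing software uses far more sophisticated models. The sketch lists every way to split a genotype into two strands and keeps the split whose strands are most common in the panel.

```python
from itertools import product


def candidate_pairs(genotype):
    """Yield every (strand1, strand2) split consistent with an unphased genotype."""
    seen = set()
    for choice in product((0, 1), repeat=len(genotype)):
        strand1 = "".join(site[c] for site, c in zip(genotype, choice))
        strand2 = "".join(site[1 - c] for site, c in zip(genotype, choice))
        key = frozenset((strand1, strand2))
        if key not in seen:  # skip the mirror image of a split already seen
            seen.add(key)
            yield strand1, strand2


def best_phase(genotype, panel_freq, rare=0.001):
    """Pick the split whose two strands are jointly most frequent in the panel."""
    def score(pair):
        return panel_freq.get(pair[0], rare) * panel_freq.get(pair[1], rare)
    return max(candidate_pairs(genotype), key=score)


# Hypothetical panel: all-A or all-G blocks make up 95% of reference strands.
panel_freq = {"AAAAAA": 0.55, "GGGGGG": 0.40, "AGAGAG": 0.05}
genotype = ["AG"] * 6  # A and G at each of the 6 addresses, order unknown

parent1, parent2 = best_phase(genotype, panel_freq)
print("Parent 1:", parent1)
print("Parent 2:", parent2)
```

Note that the output is labeled parent 1 and parent 2 on purpose. Nothing in the calculation says which strand is maternal and which is paternal.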

In order to train the computers (machine learning) to properly statistically phase testers’ results, MyHeritage uses known relationships of people to teach the machines. In other words, their reference panels of proven haplotypes grow all of the time as parent/child trios test.

Dr. Weissglas-Volkov then moved on to imputation.

When sequencing DNA, not every location reads accurately, so the missing values can be imputed, or “put back” using imputation.

Initially imputation was a hot mess. Not just for MyHeritage, but for all vendors, imputation having been forced upon them (and therefore us) by Illumina’s change to the GSA chip.

However, machine learning means that imputation models improve constantly, and matching using imputation is greatly improved at MyHeritage today.

Imputation can do more than just fill in blanks left by sequencing read errors.

The benefit of imputation to the genetic genealogy community comes from the fact that vendors use disparate chips. Vendors that want to allow uploads have been forced to use imputation to create a global template that incorporates all of the locations from each vendor, then impute the values they don’t actually test for themselves to complete the full template for each person.

In the example below, you can see that no vendor tests all available locations, but when imputation extends the sequences of all testers to the full 1-500 locations, the results can easily be compared to every other tester because every tester now has values in locations 1-500, regardless of which vendor/chip was utilized in their actual testing.
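Here is a small Python sketch of the template idea, shrunk from 500 locations to 8. The vendor chips, the reference sequences and the function names are hypothetical, and the "copy from the closest reference sequence" rule is a deliberate simplification that I'm assuming for illustration; real imputation relies on statistical models trained on large reference panels.

```python
def build_template(*vendor_locations):
    """Global template: every location tested by at least one vendor."""
    return sorted(set().union(*vendor_locations))


def impute(tested, template, reference_panel):
    """Fill untested locations from the reference that best agrees with the tested ones."""
    def agreement(ref):
        return sum(ref.get(loc) == value for loc, value in tested.items())
    best_ref = max(reference_panel, key=agreement)
    # Keep the tester's real values; borrow the rest from the closest reference.
    return {loc: tested.get(loc, best_ref.get(loc, "?")) for loc in template}


# Hypothetical chips: neither vendor tests all 8 locations.
vendor_a = {1, 2, 3, 5, 8}
vendor_b = {2, 4, 6, 7, 8}
template = build_template(vendor_a, vendor_b)

# Hypothetical reference sequences covering the full template.
reference_panel = [
    dict(zip(range(1, 9), "AACCGGTT")),
    dict(zip(range(1, 9), "GACTGATC")),
]

# A tester who used vendor A's chip only has values at vendor A's locations.
tested = {1: "A", 2: "A", 3: "C", 5: "G", 8: "T"}
full = impute(tested, template, reference_panel)
print("".join(full[loc] for loc in template))
```

Once every tester has a value at every template location, two people can be compared no matter which chip each of them was actually tested on.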

Therefore, using imputation, MyHeritage is able to match between quite disparate chips, such as the traditional Illumina chips (OmniExpress), the custom Ancestry chip and the new GSA chip utilized by 23andMe and LivingDNA.

So, how are matches determined?

Matching

First your DNA and that of another person are scanned for nearly identical seed sequences.

A minimum segment length of 6cM must be identified for further match processing to occur. Anything below 6cM is discarded at this point.

The match is then further evaluated to see if the seed match is of a high enough quality that it should be perfected and should count as a match. Other segments continue to be evaluated as well. If the matching segments total 8 cM or more, it’s considered a valid match. MyHeritage has taken the position that they would rather give you a few accidental false matches than to miss good matches. I appreciate that position.

Window cleaning is how they refer to the process of removing pileup regions known to occur in the human genome. This is NOT the same as Ancestry’s routine that removes areas they determine to be “too matchy” for you individually.

The difference is that in humans, for example, there is a segment of chromosome 6 where, for some reason, almost all humans match. Matching across that segment is not informative for genetic genealogy, so that region along with several others similar in nature are removed. At Ancestry, those genome-wide pileup segments are removed, along with other regions where Ancestry decides that you personally have too many matches. The problem is that for me, these “too matchy” segments are many of my Acadian matches. Acadians are endogamous, so lots of them match each other because as a small intermarried population, they share a great deal of the same DNA. However, to me, because I have one great-grandfather that’s Acadian, that “too matchy” information IS valuable although I understand that it wouldn’t be for someone that is 100% Acadian or Jewish.

In situations such as Ashkenazi Jewish matching, which is highly endogamous, MyHeritage uses a higher matching threshold. Otherwise every Ashkenazi person would match every other Ashkenazi person because they all descend from a small founder population, and for genealogy, that’s not useful.

The last step in processing matches is to establish the confidence level that the match is accurately predicted at the correct level – meaning the relationship range based on the amount of matching DNA and other criteria.

For example, does this match cluster with other proven matches of the same known relationship level?

From several confidence ascertainment steps, a confidence score is assigned to the predicted relationship.

Of course, you as a customer see none of this background processing, just the fact that you do match, the size of the match and the confidence score. That’s what genealogists need!

Matching Versus Triangulation Thresholds

Confusion exists about matching thresholds versus triangulation thresholds.

While any single segment must be at least 6 cM in length for the matching process to begin, the actual match threshold at MyHeritage is a total of 8 cM.

I took a look at my lowest match at MyHeritage.

I have two segments, one 6.1 cM segment, and one 6 cM segment that match. It would appear that if I only had one 6 cM segment, it would not show as a match because I didn’t have the minimum 8 cM total.
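The two thresholds can be written out as a tiny Python check. This is only a sketch of the rule as described in the session (6 cM per segment, 8 cM total); the function and constant names are mine, and it ignores the seed quality evaluation, window cleaning and the higher threshold used for endogamous populations.

```python
MIN_SEGMENT_CM = 6.0  # segments shorter than this are discarded
MIN_TOTAL_CM = 8.0    # surviving segments must add up to at least this


def is_match(segments_cm):
    """Apply the per-segment minimum first, then the total threshold."""
    kept = [cm for cm in segments_cm if cm >= MIN_SEGMENT_CM]
    return sum(kept) >= MIN_TOTAL_CM


print(is_match([6.1, 6.0]))  # my lowest match: two qualifying segments
print(is_match([6.0]))       # a single 6 cM segment on its own
print(is_match([5.9, 5.9]))  # both segments fall below the segment minimum
```

Under these assumptions the first case is a match (12.1 cM total), while the other two are not.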

Triangulation Threshold

However, after you pass those matching criteria and move on to triangulation with a matching individual, you have the option of selecting the triangulation threshold, which is not the same thing as the match threshold. The match threshold does not change, but you can change the triangulation threshold from 2 cM to 8 cM and selections in-between.

In the example below, I’m comparing myself against two known relatives.

You won’t be shown any matches below the 6 cM individual segment threshold, BUT you can view triangulated segments of different sizes. This is because matching segments often don’t line up exactly and the triangulated overlap between several individuals may be very small, but may still be useful information.

Flying your mouse over the location in the bubble, which is the triangulated segment, tells you the size of the triangulated portion. If you selected the 2 cM triangulation, you would see smaller triangulated portions of matches.
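To see why a triangulated portion can be much smaller than the matches that produce it, here is a short Python sketch. The segment positions are invented, and for simplicity I'm assuming positions are already expressed in cM along the chromosome; a real chromosome browser works with base-pair positions and a genetic map. The triangulated portion is simply the stretch that all three pairwise matches have in common.

```python
def triangulated_overlap(*segments):
    """Length in cM shared by all segments; each is (chromosome, start_cm, end_cm)."""
    chromosomes = {chrom for chrom, _, _ in segments}
    if len(chromosomes) != 1:
        return 0.0  # segments on different chromosomes cannot triangulate
    start = max(s for _, s, _ in segments)
    end = min(e for _, _, e in segments)
    return max(0.0, end - start)


def is_shown(overlap_cm, triangulation_threshold_cm):
    """Would this triangulated portion appear at the selected threshold?"""
    return overlap_cm >= triangulation_threshold_cm


# Hypothetical pairwise matches on chromosome 10.
me_vs_a = (10, 20.0, 34.0)  # 14 cM
me_vs_b = (10, 30.0, 45.0)  # 15 cM
a_vs_b = (10, 28.0, 40.0)   # 12 cM

overlap = triangulated_overlap(me_vs_a, me_vs_b, a_vs_b)
print(overlap)
print(is_shown(overlap, 2), is_shown(overlap, 8))
```

In this made-up case three matches of 12 to 15 cM each share only a 4 cM stretch, so the triangulated portion would appear with the 2 cM setting but not with the 8 cM setting.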

Closing Session

The conference was closed by Aaron Godfrey, a super-nice MyHeritage employee from the UK. The closing session is worth watching on the recorded livestream when it becomes available, in part because there are feel good moments.

However, the piece of information I was looking for was whether there will be a MyHeritage LIVE conference in 2019, and if so, where.

I asked Gilad afterwards and he said that they will be evaluating the feedback from attendees and others when making that decision.

So, if you attended or joined the livestream sessions and found value, please let MyHeritage know so that they can factor your feedback into their decision. If there are topics you’d like to see as sessions, I’m sure they’d love to hear about that too. Me, I’m always voting for more DNA😊

I hope to hear about MyHeritage LIVE 2019, and I’m voting for any of the following locations:

  • Australia
  • New Zealand
  • Israel
  • Germany
  • Switzerland

What do you think?

Elizabeth Warren’s Native American DNA Results: What They Mean

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important part of comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

  • Family History and DNA Science – How this works.
  • Elizabeth Warren’s Genealogy
  • Elizabeth Warren’s DNA Results
  • Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will include:

  • Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?
  • Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed exponentially since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customers’ results several times. The reference populations improve, the vendors’ internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that “major vendors” in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors include MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, which utilizes imputation, is not yet quite up to snuff on its ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Unexpected Results

DNA testing is notorious for unveiling unexpected results. Adoptions, unknown parents, unexpected ethnicities, previously unknown siblings and half-siblings and more.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

  • There is no Native ancestor
  • The Native DNA has “washed out” over the generations, but they did have a Native ancestor
  • We haven’t yet learned to recognize all of the segments that are Native
  • The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

That means that if you have a Native ancestor 6 generations back in your tree, you share 1.56% of their DNA, on average. I wrote the article, Ancestral DNA Percentages – How Much of Them is in You? to explain how this works.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
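The halving arithmetic is easy to check with a few lines of Python. This is just the average-case calculation described above, with generation 1 being your parents; the function name and the ancestor_fraction parameter (for an ancestor who was already admixed) are my own additions for illustration. Actual inheritance varies from person to person.

```python
def expected_share(generations_back, ancestor_fraction=1.0):
    """Average fraction of your DNA from one ancestor, halved each generation.

    ancestor_fraction is how much of that ancestor's own DNA was Native
    (1.0 = fully Native, 0.5 = half Native, and so on).
    """
    return ancestor_fraction * 0.5 ** generations_back


for generation in range(1, 11):
    print(f"{generation:2d} generations back: {expected_share(generation):.4%}")

# An ancestor 6 generations back who was only half Native.
print(f"{expected_share(6, ancestor_fraction=0.5):.4%}")
```

Six generations back works out to 1/64 of your DNA on average, which is the 1.56% mentioned above, and an already admixed ancestor cuts that figure proportionally.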

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

  • World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.
  • Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.
  • Vendor Algorithm – None of the vendors provide the exact nature of their internal algorithms that they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.
  • Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.
  • Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.
  • Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parents’ DNA test(s).
  • Two Chromosomes – You have two chromosomes, one from your mother and one from your father. DNA testing can’t easily separate those chromosomes, so the exact same “address” on your mother’s and father’s chromosomes that you inherited may carry two different ethnicities. Unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parents’ DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
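For readers who do want to peek under the hood, the difference between the two cases can be sketched in Python. The letters and the function name are hypothetical, and this toy assumes your DNA has already been divided into Mom's side and Dad's side. A segment is called IBD when one parent's side alone spells out the reference letters, and IBC when the reference can only be spelled by zigzagging between the two sides.

```python
def classify_segment(reference, mom_side, dad_side):
    """Label a segment as 'IBD', 'IBC' or 'no match' against a reference segment."""
    # Identical by Descent: one parent's strand carries the whole segment.
    if reference in (mom_side, dad_side):
        return "IBD"
    # Identical by Chance: every letter is present, but only by switching parents.
    zigzag = all(r in (m, d) for r, m, d in zip(reference, mom_side, dad_side))
    return "IBC" if zigzag else "no match"


native_reference = "AAAAAA"  # hypothetical Native segment

print(classify_segment(native_reference, "AAAAAA", "GGGGGG"))  # all As from Mom
print(classify_segment(native_reference, "AGAGAG", "GAGAGA"))  # As alternate between parents
print(classify_segment(native_reference, "GGGGGG", "CCCCCC"))  # no As at all
```

With only six addresses a zigzag is easy to produce by accident, which is the point about small segments: the shorter the stretch, the more likely a chance combination.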

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Elizabeth’s mother, Pauline Herring’s line is shown, at WikiTree, as follows:

Notice that of Elizabeth Warren’s 16 great-great-great grandparents on her mother’s side, 9 are missing.

Paper trail being unfruitful, Elizabeth Warren, like so many, sought to validate her family story through DNA testing.

Elizabeth Warren’s DNA Results

Elizabeth Warren didn’t test with one of the major vendors. Instead, she went directly to a specialist. That’s the equivalent of skipping the family practice doctor and going to the Mayo Clinic.

Elizabeth Warren had test results interpreted by Dr. Carlos Bustamante at Stanford University. You can read the actual report here and I encourage you to do so.

From the report, here are Dr. Bustamante’s credentials:

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

He’s no lightweight in the study of Native American DNA. This 2012 paper, published in PLOS Genetics, Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas focused on teasing out Native American markers in admixed individuals.

From that paper:

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10^-7, corrected p-value of 2.6 x 10^-4) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA is the scientific methodology used to group individuals into and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA, all of which would be Native. If they were only 25% Native, you would still receive about 6.25% of their DNA, but only one fourth of that 6.25% could be Native – so about 1.56%. You could also receive NONE of their Native DNA.
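The arithmetic in that answer can be sketched in a few lines. This is only the average expectation, assuming DNA halves at each generation; the function name and parameters are made up for illustration, and real inheritance is random, so an individual descendant can carry more, less, or none.

```python
def expected_share(generations_back: int, ancestor_fraction: float = 1.0) -> float:
    """Average share of a descendant's DNA expected to trace to an ancestor's
    Native ancestry. generations_back counts parents as 1, grandparents as 2,
    and so on. ancestor_fraction is how Native the ancestor was (1.0 = unadmixed).
    """
    return ancestor_fraction * 0.5 ** generations_back


# The two cases from the answer above: an ancestor 4 generations back
# who was either fully Native or only 25% Native.
for fraction in (1.0, 0.25):
    share = expected_share(4, fraction)
    print(f"{fraction:.0%} Native ancestor, 4 generations back: {share:.2%} on average")
```

The same function shows why an ancestor 6 to 10 generations back leaves so little to find: the average share drops below 2% at 6 generations and below 0.1% at 10.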

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method, looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations, which is what makes it possible to identify complete ancestral segments as Native or any other population.
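To make the contrast concrete, here is a toy sketch. It is not the software or statistics Dr. Bustamante's lab used. It assumes each marker already carries a made-up population call ("E" for European, "N" for Native), then compares counting markers one at a time with looking for runs of neighboring markers that agree. The function names and the `min_run` threshold are hypothetical.

```python
from itertools import groupby


def global_fraction(calls, label):
    """Share of markers called `label`, judging every marker on its own."""
    return sum(call == label for call in calls) / len(calls)


def local_segments(calls, label, min_run=3):
    """Runs of at least `min_run` neighboring markers that are all `label`.

    Returns (first_index, last_index) pairs, inclusive.
    """
    segments, position = [], 0
    for value, run in groupby(calls):
        length = len(list(run))
        if value == label and length >= min_run:
            segments.append((position, position + length - 1))
        position += length
    return segments


# Made-up calls for 16 markers along one stretch of a chromosome.
calls = list("EEENEEENNNNNEEEE")

print(global_fraction(calls, "N"))  # fraction of markers called Native
print(local_segments(calls, "N"))   # contiguous Native runs only
```

In this toy data the lone "N" at index 3 counts toward the overall fraction but does not form a segment, while the run of five neighbors does. That is the intuition behind looking at a location together with its neighbors.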

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options, indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict when a fully Native ancestor was most likely present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

In the meantime, you might want to read my article, Native American DNA Resources.

MyHeritage Step by Step Guide: How to Upload-Download DNA Files

In this Upload-Download Series, we’ll cover each major vendor:

  • How to download raw data files from the vendor
  • How to upload raw data files to the vendor, if possible
  • Other mainstream vendors where you can upload this vendor’s files

Uploading TO MyHeritage

Upload Step 1

To upload your DNA to MyHeritage, click here and then click on the purple “Start” button.

Upload Step 1 If You Already Have an Account at MyHeritage

If you already have an account, click here to sign in and then click on the DNA tab to display the “Upload DNA Data” option which displays the graphic above. Click on the purple “Start” button. This is the same process you’ll use whether it’s the first time you’ve uploaded a kit, or you’re uploading subsequent kits to your account that you’ll be managing.

Upload Step 2

You’ll be prompted to create a free account by entering your name, e-mail and password, and from there you can upload your autosomal DNA file.

You’ll be asked whose DNA you’re uploading and prompted to read and agree to the terms of service and consent.

Click the purple upload button.

Then click done when the file is finished uploading.

You’ll be notified by e-mail within a couple days when the file is finished processing.

Downloading FROM MyHeritage

Download Step 1

Sign on to your MyHeritage account.

Click on DNA on the upper toolbar.

The dropdown menu includes “Manage DNA Kits”

Download Step 2

At the right of the kit you wish to download, click on the three small buttons which will include an option for “Download,” as shown in the graphics below from the MyHeritage blog article.

Download Step 3

You’ll be presented with a box titled “Learn more about DNA data files.” Click the purple “Continue” button.

Download Step 4

You’ll need to confirm that you want to download your data, and that you understand that the download is outside of MyHeritage and their protection. Click the purple “Continue” button.

Download Step 5

You’ll receive a confirmation e-mail. Click on “Click here to continue with download.”

This e-mail link is only valid for 24 hours.

Download Step 6

Enter your password again, and click on the purple “Download” button.

Download Step 7

Save the file as a recognizable file name on your computer.

MyHeritage File Transfers TO Other Vendors

You can upload your MyHeritage file to other vendors, as follows.

| From below to >>> | Family Tree DNA Accepts | Ancestry Accepts | 23andMe Accepts | GedMatch Accepts |
| MyHeritage | Yes | No | No | Yes |

Neither Ancestry nor 23andMe accepts uploads from any vendor.

MyHeritage File Transfers FROM Other Vendors

You can upload files from other vendors to MyHeritage, as follows:

| | From Family Tree DNA | From Ancestry | From 23andMe | From LivingDNA |
| To MyHeritage | Yes | Yes | Yes | Yes |

Testing and Transfer Strategy

Transferring to MyHeritage is always free. You can view your ethnicity, your matches and their trees, and utilize the DNA tools, but you won’t receive the full benefit of SmartMatching and other records without a subscription. You will be limited to building a tree of 250 people for free. You can upload a GEDCOM file of any size, but if it contains more than 250 individuals, you need to subscribe to change anything in that file.

All DNA tools are free, and will remain free, for anyone who uploads before December 1, 2018. After December 1st, matching will remain free, but the advanced tools such as ethnicity, the chromosome browser, triangulation and more will require payment. MyHeritage has not yet indicated how that will work, so upload now to receive free DNA tools forever.

My testing/transfer recommendations are as follows relative to MyHeritage:

Have fun!

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

I provide Personalized DNA Reports for Y and mitochondrial DNA results for people who have tested through Family Tree DNA. I provide Quick Consults for DNA questions for people who have tested with any vendor. I would welcome the opportunity to provide one of these services for you.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 900 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA, or one of the affiliate links below:

Affiliate links are limited to:

Proving or Disproving a Half Sibling Relationship Using DNAPainter

I had this nagging match at MyHeritage for some time who had not responded to messages and who didn’t have a tree. When she did reply, she explained that she was adopted, but I had already been working on how she was related.

Initially, I didn’t think too much of the match, especially when she didn’t reply, but after SmartMatching and Triangulation appeared on the scene, this match haunted me just about daily. Who the heck was Dee? We share enough DNA that we might even share a family resemblance.

Recently, when I became focused on my Dad’s life and (ahem) bad-boy mis-adventures once again, I realized that while this clearly isn’t a half-sibling match, my half-sibling would likely be long-deceased. I was born late in my father’s life and he was breaking hearts 40 years earlier – which means he could also have been fathering children. Dee could be my half-sibling’s child or grandchild.

Let’s take a look at this situation and how I used DNAPainter to quickly narrow the possibilities, even with no additional information.

The Problem

Here’s my match to Dee (not her name) at MyHeritage.

Dee matches me at 521 cM on 17 segments.

Taking a quick look at the DNAPainter Shared cM Tool, you can see that Dee falls into the non-dimmed relationship ranges below, with dark grey being the most probable.

The most likely relationships are shown in the table below.

Dee is in her 50s, so she’s clearly not my great aunt or uncle or grandparent.

The Possibilities

Based on who she matches, I know the match is from my father’s side. I have no full siblings and my mother’s DNA is at MyHeritage.

My father could have been begetting children beginning about 1917 or so and could have continued through his death in 1963.

My half sister’s daughter has also tested at MyHeritage, and Dee matches her more distantly than me, so Dee is not an unknown descendant of my half-sister.

Dee could have been a child or grandchild of a half sibling that I’m unaware of – which of course is my burning question.

I checked the in-common-with matches and while they made sense, I needed something much faster than working with multiple trees and matches and attempting to build them out.

Besides, I desperately wanted a quick answer.

DNAPainter to the Rescue

I’ve written three previous articles about utilizing DNAPainter.

I continue to paint matches where I can identify known ancestors. Currently, I’m up to 689 segments identified and painted which is about 62% of my genome.

Surely this investment should pay off now, if I can only figure out how.

I’ve painted hundreds of segments on both my paternal grandmother and grandfather’s sides. If Dee is a half sibling (descendant) to me, she will match both my paternal grandmother’s line and my paternal grandfather’s line. If Dee is related on one of those lines, but not the other, then Dee will match one grandparent’s line, but not the other grandparent’s line.

Dee can’t be descended from a half sibling if she doesn’t match both of my paternal grandparents, meaning William George Estes and Ollie Bolton’s lines.
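The logic of that test can be sketched as a simple overlap check. This is only an illustration, not how DNAPainter works internally. The segment coordinates and the `painted` list are made up, and the sketch assumes every painted segment is already labeled with the grandparent line it came from.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """True when two ranges on the same chromosome share any positions."""
    return a_start < b_end and b_start < a_end


def sides_hit(match_segments, painted_segments):
    """Count how many painted segments on each grandparent line the match overlaps."""
    counts = Counter()
    for chrom, start, end in match_segments:
        for p_chrom, p_start, p_end, side in painted_segments:
            if chrom == p_chrom and overlaps(start, end, p_start, p_end):
                counts[side] += 1
    return counts


# Hypothetical painted segments: (chromosome, start, end, grandparent line).
painted = [
    (1, 10_000_000, 40_000_000, "paternal grandfather"),
    (1, 60_000_000, 90_000_000, "paternal grandmother"),
    (5, 5_000_000, 30_000_000, "paternal grandfather"),
]

# Hypothetical segments shared with the mystery match: (chromosome, start, end).
dee = [(1, 15_000_000, 35_000_000), (5, 8_000_000, 20_000_000)]

hits = sides_hit(dee, painted)
print(dict(hits))

# Descent from a half sibling requires overlap with BOTH grandparents' lines.
half_sibling_line_possible = all(
    hits[side] > 0 for side in ("paternal grandfather", "paternal grandmother")
)
print(half_sibling_line_possible)
```

A result of False here only means the match fails the both-grandparents test on the segments painted so far. The more of the genome that is painted, the more weight that result carries.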

Painting

The first thing I did was to paint the segments where Dee and I match, assigning a unique color.

After painting, I compared each chromosome individually, looking at the other ancestors painted that overlapped with the bright yellow.

The next step was to look at each chromosome and see which ancestor’s DNA overlaps with Dee’s.

Without fail, every single one of these segments matched with my paternal grandfather’s side, and none matched with my paternal grandmother’s side.

To confirm, I have a cousin, we’ll call him Buzz, whose ancestor was my grandmother’s brother, so Buzz is my second cousin. If Dee is my half sibling’s child or grandchild, Buzz, who also tested at MyHeritage, would be Dee’s second cousin or second cousin once removed. No second cousins have ever been proven NOT to match, so it’s extremely unlikely that Dee is descended through Ollie Bolton.

Is there a very small possibility? Yes, if Dee is actually a second cousin twice removed from Buzz, which is genetically the equivalent of a third cousin. Third cousins only match about 90% of the time.

However, Dee also doesn’t match anyone else on my grandmother’s side, so it’s very unlikely that Dee descends from Ollie Bolton’s parents, Joseph “Dode” Bolton and Margaret Clarkson/Claxton.

Therefore, we’ve just “proven,” as best we can, that Dee does NOT descend from a previously unknown half-sibling.

We’ll just pause for a minute here – I was so hopeful☹

Regroup – Other Possible Relationships

OK, redraw the chart without Ollie. Dee is still very closely related, so what are the other possibilities?

Dee does match people with ancestors from both the lines of Lazarus Estes and Elizabeth Vannoy, so Dee is either an unknown descendant of William George Estes or his parents, given how closely she matches me and other descendants of this family.

Or… as luck would have it, Dee could also be descended from the sister of Lazarus Estes (Elizabeth Estes) who married the brother of Elizabeth Vannoy (William George Vannoy.) Yes, siblings married siblings. Two children of Joel Vannoy and Phoebe Crumley married two children of John Y. Estes and Rutha (or Ruthy) Dodson.

You know, these mysteries can never be simple, can they?

In the chart above, gold represents the people who descend from a combination of a pink and blue couple. Joel Vannoy and Phoebe Crumley are shown twice because there was no easy way to display this couple.

One way or another Dee and I are related through these two couples. Of course, I’m curious as to how, and excited to help Dee learn about her family, but this isn’t going to be an easy solve, because of the potential double descent. Under normal circumstances, meaning NOT doubly related, Dee is most likely my half-great niece, meaning that her unknown grandparent is either a child of William George Estes (my grandfather) or descended from his parents, Lazarus Estes and Elizabeth Vannoy.

However, the doubling of DNA in the William George Vannoy/Elizabeth Estes line would make Dee look a generation closer if she descends from that line, making her the genetic equivalent of a descendant of Lazarus Estes and Elizabeth Vannoy. The only way to solve this equation would be to see how closely she matches a descendant of Elizabeth Estes and William George Vannoy – and no one from that line is known to have tested to date.

For now, my driving question of whether I had discovered an unknown half-sibling has (most probably) been answered by combining the segment information at MyHeritage with the functionality of DNAPainter.

Ancestry Step by Step Guide: How to Upload-Download DNA Files

In this Upload-Download Series, we’ll cover each major vendor:

  • How to download raw data files from the vendor
  • How to upload raw data files to the vendor, if possible
  • Other mainstream vendors where you can upload this vendor’s files

Uploading TO Ancestry

This part is easy with Ancestry, because Ancestry doesn’t accept any other vendor’s files. There is no ability to upload TO Ancestry. You have to test with Ancestry if you want results from Ancestry.

Downloading FROM Ancestry

In order to transfer your autosomal DNA file to another testing vendor, or GedMatch, for either matching or ethnicity, you’ll need to first download the file from Ancestry.

Step 1

Sign in to your account at Ancestry and click on the DNA Results Summary link.

Step 2

Click on the Settings gear, at the far upper right hand corner of the summary page, just beneath your Ancestry user ID.

Step 3

Click on the link for “Download Raw DNA Data.”

Step 4

Enter your password and click on “I Understand,” after reading of course.

At that point, the confirm button turns orange – click there.

Step 5

Ancestry will send an e-mail to the e-mail address where you are registered with Ancestry. Check your inbox for that e-mail.

Waiting…waiting.

Still waiting…

If the e-mail doesn’t arrive shortly, check your spam folder. If you’ve changed e-mail addresses, check to be sure your new one is registered with Ancestry. That’s on the same Settings page. If all else fails, request the e-mail again.

Step 6

Ahhh, it’s finally here.

Click on the green “Confirm Data Download” and do not close the window.

Step 7

Next, click on the green “Download DNA Raw Data.”

You’ll see the following confirmation screen.

Step 8

At the bottom of the page, above, if you’re on a PC, you’ll see the typical file download box that asks you if you want to open or save. Save the file as a name you can find later when you want to upload to another site.

The file name will be “dna-data-2018-07-31” where the date is the date you downloaded the file. I would suggest adding the word Ancestry to the front when you save the file on your system.

Most vendors want an unopened zip file, so if you want to open your file, first copy it to another name. Otherwise, you’ll have to download again.

That’s it, you’re done!

Ancestry File Transfers to Other Vendors

Ancestry testing falls into two categories: V1 tests taken before May 2016 and V2 tests taken after May 2016. Tests processed during May 2016 could be either version.

The difference between V1 and V2 files is that Ancestry changed the chips they use to test and different DNA positions are tested, resulting in a file of a different format.

If you don’t remember when you tested, make a copy of your Ancestry file using a different name, like, “Opened Ancestry file 7-31-2018.” Then just click to open the zip file.

The first four rows of the file will say something like this:

#AncestryDNA raw data download
#This file was generated by AncestryDNA at: 08/11/2017 07:23:49 UTC
#Data was collected using AncestryDNA array version: V1.0
#Data is formatted using AncestryDNA converter version: V1.0

This is a version 1 (V1) file.

A version 2 file will say V2.0.
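If you would rather not open the file by hand, the version line can be read with a short script. This is a minimal sketch, assuming the zip holds a single text file whose header looks like the sample above. The function name is made up, and the demo builds a small stand-in zip in memory instead of using a real download. Reading the zip this way does not change it.

```python
import io
import zipfile


def ancestry_array_version(zip_source):
    """Return the array version (such as 'V1.0') from an AncestryDNA zip, or None.

    zip_source may be a file path or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Assumption: the raw data file is the first (and only) member.
        name = archive.namelist()[0]
        with archive.open(name) as raw:
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                if not line.startswith("#"):
                    break  # header comments are over, data rows begin
                if "array version:" in line:
                    return line.split("array version:", 1)[1].strip()
    return None


# Stand-in for a real download, using the header lines shown above.
header = (
    "#AncestryDNA raw data download\n"
    "#This file was generated by AncestryDNA at: 08/11/2017 07:23:49 UTC\n"
    "#Data was collected using AncestryDNA array version: V1.0\n"
    "#Data is formatted using AncestryDNA converter version: V1.0\n"
)
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("dna-data.txt", header)

print(ancestry_array_version(demo))
```

With a real download, you would pass the path of the saved zip file instead of the in-memory demo.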

Your upload results to other vendors’ sites will vary in terms of both matching and ethnicity accuracy based on your Ancestry version number, as follows:

| From below to >>> | Family Tree DNA Accepts ** | MyHeritage Accepts *** | 23andMe Accepts * | GedMatch Accepts **** |
| Ancestry before May 2016 (V1) | Yes, fully compatible | Yes, fully compatible | No | Yes |
| Ancestry after May 2016 (V2) | Yes, partly compatible | Yes, fully compatible | No | Yes |

*Note that 23andMe earlier in 2018 allowed a one-time transfer from Ancestry, but people who transferred results did not receive matches from 23andMe.

**Note that the transfer to Family Tree DNA and matching is free, but advanced tools including the chromosome browser and ethnicity require a $19 unlock fee. That fee is less expensive than retesting, but V2 customers should consider retesting to obtain fully compatible matching and ethnicity results. V2 tests typically receive only the closest 20-25% of matches they would receive if they tested directly at Family Tree DNA.

***MyHeritage utilizes a technique known as imputation to achieve compatibility between different vendors’ files. The transfer and tools are free, but without a subscription you can’t fully utilize all of the MyHeritage benefits available.

****I’m not sure exactly how GedMatch compensates for the V1 versus V2 differences, but they can handle both data file types. Most people don’t take both tests, but I was conducting an experiment and have uploaded both V1 and V2 tests.

A quick survey of GedMatch matches to my Ancestry V1 and Ancestry V2 kits shows that of my first 249 matches (125 V2, 124 V1), 3 V1 matches have no corresponding match on the V2 kit, and 5 V2 matches have no corresponding match on the V1 kit. That’s roughly a 6% nonmatch rate between Ancestry V1 and V2 kits. I would presume that the percentage of non-matches increases with more distant matches, because their segments are smaller, so there is less shared DNA to match on in the first place.
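A comparison like this can be sketched with two sets of match names. The match names below are made up, and the denominator is an assumption: dividing the 8 unpaired matches by the roughly 125 matches on each kit is one way to arrive at a figure near 6%, but the text does not say how the rate was computed.

```python
def compare_match_lists(first_kit_matches, second_kit_matches):
    """Return (only on first kit, only on second kit) as two sets."""
    first, second = set(first_kit_matches), set(second_kit_matches)
    return first - second, second - first


def nonmatch_rate(only_first, only_second, matches_per_kit):
    """Unpaired matches as a share of the matches on one kit (assumed denominator)."""
    return (only_first + only_second) / matches_per_kit


# Made-up match names, just to show the set comparison.
v1_matches = ["Ann", "Bob", "Cal", "Dot"]
v2_matches = ["Bob", "Cal", "Dot", "Eve", "Fay"]
only_v1, only_v2 = compare_match_lists(v1_matches, v2_matches)
print(sorted(only_v1), sorted(only_v2))

# The counts reported above: 3 V1-only and 5 V2-only, about 125 matches per kit.
print(f"{nonmatch_rate(3, 5, 125):.1%}")
```

Measured against all 249 matches instead, the same 8 unpaired matches would come to about 3%, so the choice of denominator matters when comparing figures like this.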

Testing and Transfer Strategy

My recommendation, if you test at Ancestry, is to transfer your V1 results to MyHeritage, Family Tree DNA and GedMatch.

An Ancestry V1 test is entirely compatible at Family Tree DNA. With a V2 test, because the testing platform that Ancestry uses is only about 20-25% compatible with the Family Tree DNA test, you’ll only receive your closest 20-25% of matches. Family Tree DNA can’t match on those smaller segments if you don’t test on a compatible platform, so please do retest there.

If you have Ancestry V2 results, transfer to MyHeritage and GedMatch but retest at Family Tree DNA. The cost difference at Family Tree DNA between the $19 unlock and a new Family Finder test is $60, for a total of $79 when the tests aren’t on sale. When they are on sale, it’s less. Right now, the tests are only $59.

You never know which match is going to break down that brick wall, and it would be a shame to miss it because you transferred rather than retested.

Matching and ethnicity are free with a transfer to MyHeritage, but you won’t receive the full potential benefit of SmartMatching without a subscription, as free trees are limited to 250 people and genealogical records aren’t included without a subscription. My subscription has been well worth the $.

Ancestors: What Constitutes Proof?

All genealogists should be asking this question for every single relationship between people in their trees – or at least for every person that they claim as an ancestor. The answer differs a bit when you introduce DNA into the equation, so let’s discuss this topic.

It’s easier to begin by telling you what proof IS NOT, rather than what proof is.

What is Proof, Anyway?

First of all, what exactly do we mean by proof? Proof means proof of a relationship, which has to be proven before you can prove a specific ancestor is yours. It’s a two-step process.

If you’re asking whether those two things are one and the same, the answer is no, they are not. Let me give you a quick example.

You can have proof that you descend from the family of a specific couple, but you may not know which child of that couple you descend from. In one case, my ancestor is listed as an heir, being a grandchild, but the suit doesn’t say which of the man’s children is the parent of my ancestor. So frustrating!

Conversely, you may know that you descend from a specific ancestor, but not which of his multiple wives you descend from.

You may know that your ancestor descends from one of multiple sons of a particular man, but not know which son.

Therefore, proof of a relationship is not necessarily proof that a particular person is your ancestor.

Not Proof of an Ancestor

OK, so what’s NOT proof? Here are a dozen of the most common items – and there are surely more!

  1. Proof is not a DNA match alone. You can match as a result of ancestors on any number of lines, known or unknown.
  2. Proof is not an oral history, no matter how much you want to believe it or who said it. Oral history is a good starting point, not an end point.
  3. Proof is not, not, 1000 times NOT someone else’s tree. A tree should be considered a hint, nothing more.
  4. Proof is not a book without corresponding evidence that can be independently corroborated. Being in print does not make it so; people make mistakes and new information surfaces.
  5. Proof is not a man by the name of Jr., meaning that he is the son of a man by the same name with the suffix of Sr. Sr. often means older and Jr. means younger, but not necessarily related. Yes, this has bitten me.
  6. Proof of a father/son relationship is not two men with the same name in the same location.
  7. Proof is not a Y DNA match, at least not without additional information or evidence, although it’s a great hint!
  8. Proof is not an autosomal DNA match, unless it is an extremely close match and even then you (probably) need additional information. For example, if you have a half-sibling match, you need additional information to determine which parent’s side.
  9. Proof is not an Ancestry Circle, at least not without additional information.
  10. Proof is not similar or even identical ethnicity, or lack thereof.
  11. Proof is not a “DNA Proven” icon, anyplace.
  12. Proof is not a will or other document, at least not alone, and not without evidence that a person by the same name as the child is the RIGHT person.

I learned many of these NOTS or KNOTS as I prefer to call them, because that’s what they tie me in, by ugly experience. I began genealogy before there were proof standards, let alone the GPS (Genealogical Proof Standard). DNA adds yet another dimension to existing paper standards and is an important aspect of the requirement for a “reasonably exhaustive search.” In fact, there is no reason NOT to include DNA and I would suggest that any genealogical search is not complete without including genetic evidence.

Proof Is a Two-Way Street

Using traditional genealogy, genealogists must be able to prove not only that an ancestor had a child by a specific name, but that the person you believe is the child, is indeed the child of that ancestor.

Let me use an example of Daniel, the son of one Philip Jacob Miller in Washington County, Maryland in 1783.

The tax list shows Philip J. Miller, 15 entries from the bottom of the page, shown below. It also shows “Daniel Miller of Philip” 6 entries from the bottom, and it’s our lucky day because the tax list says that Daniel is Philip’s son.

But wait, there’s another Daniel, the bottom entry. If you were to look on the next page, you would also notice that there’s a Philip Miller who does not own any land.

What we have here is:

  • Philip J. Miller, with land
  • Daniel, son of Philip, no land
  • Daniel, no father listed, land
  • Philip, no land

This just got complex. We need to know which Philip is Daniel’s father and which Daniel is which Philip’s son.

Establishing proof requires more than this one resource.

The great news about this tax list is that it tells us how much land Philip J. Miller owned, and utilizing other resources such as deeds and surveys, we can establish which Philip J. Miller owned this land, and that his name was indeed Philip Jacob Miller. This is important because not only is there another Philip, who, by the way, is NOT the son of Philip Jacob Miller (knot #6 above), there is also another Jacob Miller, who is NOT Philip Jacob Miller and who isn’t even related to him on the Miller line, according to the Y DNA of both men’s descendants.

How would we prove that Philip Jacob Miller is the father of Daniel Miller? We’d have to follow both men backward and forward in time, together. We have great clues – land ownership or lack thereof.

In this case, Philip Jacob Miller eventually sells his land. Philip Jacob Miller also has a Bible, which is how we know that there is no son named Philip. Philip Jacob’s son Daniel leaves with his brother David, who is also on this tax list, and travels to another location before the family is reunited after moving to Kentucky years later, where Philip Jacob Miller dies with a will. All of his heirs sign property deeds during probate, including heirs back in Frederick and Washington Counties, Maryland. There is enough evidence from multiple sources to tie these various family members from multiple locations conclusively together, providing two-way proof.

We must be able to prove that not only did Philip Jacob Miller have a son Daniel, but that a specific Daniel is the son of that particular Philip Jacob Miller. Then, we must repeat that exact step every generation to the present to prove that Philip Jacob Miller is our ancestor.

In other words, we have a chain of progressive evidence that taken together provides conclusive proof that these two men are BELIEVED to be related. What? Believed? Don’t we have proof now?

I say believed, because we still have issues like unknown parentage, by whatever term you wish to call it: NPE (nonpaternal event or nonparental event), MP (misattributed parentage), MPE (misattributed paternal or parental event), or either traditional or undocumented adoptions. Some NPEs weren’t unknown at the time and are the results of situations like a child taking a step-parent’s surname, but generations later, having been forgotten or left undocumented for descendants, the result is the same. They aren’t related biologically in the way we think they are.

The Big Maybe

At this point, we believe we have the Philips, Philip Jacobs and Daniels sorted correctly relative to my specific line. We know, according to documentation, that Daniel is the son of Philip Jacob, but what if MY ancestor Daniel ISN’T the son of Philip Jacob Miller?

  • What if MY ancestor Daniel just happens to have the name Daniel Miller and lives in the same geography as Philip Jacob Miller, or his actual son Daniel, and I’ve gotten them confused?
  • What if MY ancestor Daniel Miller isn’t actually my ancestor after all, for any number of reasons that happened between when he lived and died (1755-1822) and my birth?

If you think I’m being facetious about this, I’m not. Not long after I wrote the article about my ancestor Daniel Miller, we discovered another Daniel Miller, living in the same location, also descended from the same family as evidenced by BOTH Y and autosomal DNA. In fact, there were 12 Daniel Millers I had to sort through in addition to the second Daniel on the 1783 tax list. Yes, apparently Daniel was a very popular name in the Miller family and yes, there were several sons of immigrant Johann Michael Muller/Miller who procreated quite successfully.

Enter DNA

If DNA evidence wasn’t already a factor in this equation, it now must come into play.

In order to prove that Philip Jacob Miller is my ancestor, I must prove that I’m actually related to him. Of course, the methodology to do that can be approached in multiple ways – and sometimes MUST be approached using different tools.

Let’s use an example that actually occurred in another line. Two males, Thomas and Marcus Younger, were found together in Halifax County, Virginia, right after the Revolutionary War. They both had moved from Essex County, and they consistently were involved in each other’s lives as long as they both lived. They lived just a couple miles apart, witnessed documents for each other, and until DNA testing it was believed that Marcus was the younger brother of Thomas.

We know that Marcus was not Thomas’s son, because he was not in Thomas’s will, but Marcus and his son John both witnessed Thomas’s will. In that time and place, a family member did not witness a will unless it was a will hastily constructed as a person was dying. Thomas wrote his will 2 years before it was probated.

However, with the advent of DNA testing, we learned that the two men’s descendants did not carry the same Y DNA – not even the same haplogroup – so they do not share a common paternal ancestor.

Needless to say, this really threw a monkey wrench into our neat and tidy family story.

Later, the will of Thomas’s father, Alexander, was discovered. Marcus was not listed in it (not to mention that Alexander died before Marcus was born), and Thomas became the guardian of his three sisters.

Eventually, via autosomal DNA, we proved that indeed, Marcus’s descendants are related to Thomas’s descendants as well as other descendants of Thomas’s parents. We have a proven relationship, but not a specifically proven ancestor. In other words, we know that Marcus is related to both Thomas and Alexander, we just don’t know exactly how.

Unfortunately, Marcus only had one son, so we can’t confirm Marcus’s Y DNA through a second line. We also have some wives missing from the equation, so there is a possibility that either Marcus’s wife, or his unknown biological father’s family was otherwise related to Alexander’s line.

So, here’s the bottom line – we believe, based on various pieces of compelling but not conclusive evidence that Marcus is the illegitimate child of one of Thomas’s unmarried sisters, who died, which is why Marcus is clearly close to Thomas, shares the same surname, but not the Y DNA. In fact, it’s likely that Marcus was raised in Thomas’s household.

  • It’s entirely possible that if I incorrectly listed Thomas as Marcus’s father on Ancestry, as many have, that I would be placed in a Thomas circle, because Ancestry forms circles if your autosomal DNA matches and you show a common ancestor in your trees. This is why inclusion in a circle doesn’t genetically confirm an ancestor without additional information. It confirms a genetic relationship, but not how a person is related.
  • It’s entirely possible that even though Marcus’s Y DNA doesn’t match the proven Y DNA of Thomas, Marcus is still closely related to Thomas. For example, Thomas could be Marcus’s uncle. That’s why Marcus’s descendants match both Thomas’s and Alexander’s descendants through autosomal testing. However, without Y DNA testing, we would never know that they don’t share a paternal line.
  • It’s entirely possible that if Marcus was supposed, on paper, to be Thomas’s child, but was fathered by another man, such as his wife’s first husband, I would still be in the circle attributed to both Thomas and his wife, by virtue of the fact that I match DNA of Thomas’s descendants through Thomas’s wife. This is your classic step-father situation.

Paper is Not Proof

As genealogists, we became so used to paper documentation constituting proof that it’s a blow when that paper proves to be irrelevant, especially when we’ve hung our genealogical hat on that “proof” for years, sometimes decades.

The perfect example is an adoption. Today, most adoptions are through a court of law, but in the past, a functional adoption happened when someone, for whatever reason, took another child to raise.

The history of that “adoption” although not secret when it happened, became lost in time, and the child is believed to be the child of the couple who raised them. The adoption can actually be a step-parent situation, and the child may carry the step-father’s surname but his own father’s Y DNA, or it can be a situation where a relative or unrelated couple raised the child for some unknown reason.

Today, all paper genealogy needs to be corroborated by DNA evidence.

DNA evidence can be some combination of:

  • Y DNA
  • Autosomal DNA
  • Mitochondrial DNA

How Much Proof is Enough?

One of my favorite sayings is “you don’t know what you don’t know.”

People often ask:

  1. If they match someone autosomally who shares the same ancestor, do they really need to prove that line through Y or mitochondrial DNA?
  2. Do they really need to match multiple people?
  3. Do they really need to compare segments?

The answer to these is a resounding, “it depends.”

It depends on the circumstances, the length of time back to the common ancestor, and how comfortable you are not knowing.

Relative to question 1 about autosomal plus Y DNA, think about Marcus Younger. Without the Y DNA, we would have no idea that his descendants’ Y DNA didn’t match the Thomas Younger line. Suddenly, Marcus not being included in either Thomas’s or Alexander’s will makes sense.

Relative to question 2 about matching multiple people, the first cousin we tested to determine whether it was me or my brother that was not the child of our father turned out to have different Y DNA than expected. Thank goodness we tested multiple people, including autosomal when it became available.

Relative to question 3 about comparing segments, every matching segment has its own unique history. I’ve encountered several situations where I match someone on one segment from one ancestor, and another segment from an entirely different line. The only way to determine this is by comparing and triangulating individual segments.

I’ve been bitten so many times by thinking I knew something that turned out to be incorrect that I want every single proof point that I can obtain to eliminate the possibility of error – especially multiple kinds of DNA proof. There are some things that ONLY DNA can reveal.

I want:

  • Traditional documentary evidence for every generation to establish the actual paper trail that proves that the child descends from the proper parents.
  • Y DNA to prove the son is the son of the father and to learn about the deeper family history. For example, my Lentz line descends from the Yamnaya culture, something I would never have known without the Big Y DNA test.
  • Mitochondrial DNA to prove that the mother is the actual mother of the child, if possible, not an unknown earlier or later wife, and to learn about the deeper family history. Elizabeth Mehlheimer’s mitochondrial DNA is Scandinavian – before her ancestors are found in Germany.
  • Autosomal DNA to prove that the paper lineage connecting me to the ancestor is correct and the line is not disrupted by a previously unknown adoption of some description.

I attempt to gather the Y and mitochondrial DNA haplogroup of every ancestor in my direct line if possible and confirm using autosomal DNA.

Yes, my personal proof standard is tough, but I suggest that you at least ask these questions when you evaluate documentation or see someone claim that they are “DNA proven” to an ancestor. What, exactly, does that mean and what do they believe constitutes proof? Do they have that proof, and are they willing to share it with you?

Genealogical Proofs Table

The example table below is designed to be used to document the sources of proof that the individual listed under the name column is in fact the child of the father and mother shown. Proofs may vary and could be personal knowledge (someone you knew within your lifetime), a Bible, a will, a deed, an obituary, death certificate, a church baptismal document, a pension application, census records, etc. DNA confirmation is needed in addition to paper documentation. The two types of proof go hand in hand.  

Name | Birth | Death | Spouse | Father | Mother | Proofs – Sources | DNA Confirmed
William Sterling Estes | Oct. 1, 1902, Claiborne Co., TN | Aug. 27, 1963, Jay Co., IN | Barbara Ferverda | William George Estes 1873-1971 | Ollie Bolton 1874-1955 | Personal knowledge – William is my father and William George is my grandfather. | Autosomal triangulated to multiple Estes cousins.
William George Estes | March 30, 1873, Claiborne Co., TN | Nov. 29, 1971, Harlan Co., KY | 1. Ollie Bolton; 2. Joyce Hatfield; 3. Crocia Brewer | Lazarus Estes 1845-1918 | Elizabeth Vannoy 1846-1918 | 1. Will of Lazarus Estes, Claiborne Co., TN, Will Book 8, page 42; 2. Deed where Lazarus states William George is his son, Claiborne Co., Deed Book M2, page 371; 3. My father’s personal knowledge and birth certificate | Autosomal triangulated to multiple descendants of both Lazarus Estes and Elizabeth Vannoy.
Lazarus Estes | May 1845, Claiborne Co., TN | 1916-1918, Claiborne Co., TN | Elizabeth Vannoy | John Y. Estes 1818-1895 | Rutha Dodson 1820-1903 | 1. Personal knowledge of George Estes, now deceased; 2. Deed where John Y. deeds all his possessions to his eldest son, Lazarus, when he goes to Texas, Claiborne Co., Deed Book B1, page 37 | Y DNA confirmed to haplotype of Abraham Estes, autosomal triangulated to descendants of Lazarus and Elizabeth and upstream ancestors through multiple matches on both sides.
John Y. Estes | December 29, 1818, Halifax Co., VA | Sept. 19, 1895, Montague Co., TX | Rutha Dodson | John R. Estes 1785/88-1885 | Nancy Ann Moore c 1785-1860/1870 | 1. Family visits of his children in Tennessee; 2. Census records, 1850, 1860, Claiborne Co., TN, show families in same household | Y DNA confirmed through multiple sons. Autosomal triangulates to several descendants through multiple lines of other children.
John R. Estes | 1785-1788, Halifax Co., VA | May 1885, Claiborne Co., TN | Nancy Ann Moore | George Estes 1763-1869 | Mary Younger bef 1775-1820/1830 | 1. Halifax County 1812 personal property tax list where John R. Estes is listed as the son of George Estes and lives next to him. Only 1 George in the county. Later chancery suit lists John R.’s wife’s name and location in Tennessee | Y DNA confirmed through multiple lines. Autosomal confirmed triangulation of multiple lines of his children and his ancestors on both sides.
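If you keep a table like this electronically, a small script can point out which generations still lack either paper sources or DNA confirmation. The sketch below is only an illustration in Python. The `ProofRecord` class and the `unproven_links` function are names I made up for this example, and the check is deliberately simple: it only looks for whether each kind of proof has been recorded at all, not whether that proof is any good.

```python
from dataclasses import dataclass, field


@dataclass
class ProofRecord:
    """One row of a proofs table (hypothetical structure for illustration)."""
    name: str
    father: str
    mother: str
    sources: list = field(default_factory=list)  # documentary proofs
    dna_confirmation: str = ""                    # e.g. "Y DNA", "autosomal triangulated"


def unproven_links(lineage):
    """Return (name, [missing proof types]) for each record lacking proof.

    A record passes only when it has at least one documentary source
    AND a non-empty DNA confirmation note, since the two go hand in hand.
    """
    problems = []
    for record in lineage:
        missing = []
        if not record.sources:
            missing.append("documentary sources")
        if not record.dna_confirmation.strip():
            missing.append("DNA confirmation")
        if missing:
            problems.append((record.name, missing))
    return problems
```

Running `unproven_links` over every generation from yourself back to the ancestor gives you a quick to-do list. Judging the quality of each source is still up to you.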

If you’d like to read more about the difference between evidence and proof, and how to get from evidence to proof, check out this article, What is proof of family history? by my cousin, retired attorney, Robin Rankin Willis.

Proof is a Pain!

So now that we’ve discussed what proof is not, and what types of records constitute proof, you may be thinking to yourself that proof is a pain in the behind. Indeed, it is, but without sufficient proof, you may literally be doing someone else’s genealogy or the genealogy of an ancestor that’s not your own. Trust me, that’s infinitely more painful.

I hate sawing branches off of my own tree. If I have to do it, the sooner I make the discovery and get it over with, the better.

Been there, done that, and really, I don’t want the t-shirt.

There is never such a thing as “too much” proof, but there is certainly too little. We are fortunate to live in a time when not only are historical records available, but the record passed by our ancestors inside our very cells tells their story. Use every tool and every type of DNA at your disposal! Otherwise, you get the t-shirt:)

_____________________________________________________________________

Dateline: Father’s Day – The Unexpected Gift

On Father’s Day, NBC’s Dateline aired a full segment about what happened to one family as a result of DNA testing. And it’s not at all what they expected.

A woman tested her DNA, but the family she found was not the family she was looking for.

“I knew everybody, right???”

“She’s just been waiting for us all these years….”

“A moment 50 years in the making…”

“It was a gaping hole…”

Put another way, by Bennett Greenspan, CEO, Family Tree DNA, “History may get righted.”

“DNA is like a history book written into your cells and only now in the beginning of the 21st century are we learning how to read the book.” – Bennett Greenspan

“It was the middle of the night.  He told her he found me.  I can hear her crying…”

“He couldn’t hardly talk…”

“We watched pain turn into joy.”

Poverty and prejudice are evil, in all of their incarnations.

Two families about to become one.

There is absolutely no way on this earth that you can get through this dry-eyed, so just get the box of Kleenex now and click the link to watch the segment.

https://www.nbc.com/dateline/video/fathers-day/3745516

_____________________________________________________________________

DNA Painter – Touring the Chromosome Garden

This is the third article in a series about DNA Painter. To know DNA Painter is to love DNA Painter! Trust me!

The first two articles are:

The Chromosome Sudoku article introduces you to DNA Painter, its purpose and how to use the tool. The Mining Vendor Data article illustrates exactly how to find the segments you can paint from each of the main autosomal testing vendors and GedMatch.

This article is a leisurely tour through my colorful chromosome garden so that, together, we can see examples of how to utilize the information that chromosome painting unveils.

Chromosome painting can do amazing things: walk you back generations, show visual phasing…and reveal that there’s a mistake someplace, too.

If you’re not willing to be wrong and reconsider, this might not be the field for you😊

Automatic Triangulation

Chromosome painting automatically triangulates your DNA, mathematically, and in a much easier way than the old spreadsheet method. In fact, triangulation just happens, effortlessly, IF you can determine which side is maternal and which side is paternal. Of course, you’ll always want to check to be sure that your matches also match each other. If not, that’s an indication that one or both may be identical by chance.

The definition of triangulation in this context means:

  • To find a common segment
  • Of reasonable size (generally 7cM or over)
  • That is confirmed to a common ancestor with at least two other individuals
  • Who are not close family

Close family generally means parents, siblings, sometimes grandparents, although parents and grandparents can certainly be used to verify that the match is valid. The best triangulation situation is when you match those two other people through a second child, meaning siblings of your ancestor.
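For readers who like to see the logic spelled out, here is a rough sketch in Python of how that definition could be applied to a list of painted segments. This is not how DNA Painter works internally. The `Segment` class, its field names and the `candidate_triangulation_groups` function are all made up for illustration, and I’m assuming you already know which side (maternal or paternal) each match belongs to. It only finds candidates: you still need to confirm the common ancestor and that the matches also match each other.

```python
from dataclasses import dataclass
from itertools import combinations

MIN_CM = 7.0  # "reasonable size" threshold from the definition above


@dataclass(frozen=True)
class Segment:
    match_name: str
    chromosome: str
    start: int            # start position on the chromosome
    end: int              # end position on the chromosome
    cm: float             # segment size in centimorgans
    side: str             # "maternal" or "paternal"
    close_family: bool = False


def overlaps(a, b):
    """True when two segments share an address on the same parental side."""
    return (a.chromosome == b.chromosome and a.side == b.side
            and a.start < b.end and b.start < a.end)


def candidate_triangulation_groups(segments):
    """Pair up overlapping segments from two different non-close matches."""
    usable = [s for s in segments if s.cm >= MIN_CM and not s.close_family]
    groups = []
    for a, b in combinations(usable, 2):
        if a.match_name != b.match_name and overlaps(a, b):
            groups.append((a.match_name, b.match_name, a.chromosome,
                           max(a.start, b.start), min(a.end, b.end)))
    return groups
```

Each result names two matches, the chromosome, and the start and end of the region where their segments overlap with each other on the same side of your DNA. Note that the 7 cM filter here applies to each whole segment, not to the overlapping portion, so a very short overlap still deserves a skeptical look.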

Different matches, depending on the circumstances, have a different level of value to you as a genealogist. In other words, some are more solid than others.

The X chromosome has special matching and triangulation rules, so we’ll talk about that when we get to that section.

Don’t think of chromosome painting as “doing” triangulation, because triangulation is a bonus of chromosome painting, and it just happens, automatically, so long as you can confirm that the segment is from either your maternal or paternal line.

What does triangulation look like in DNA Painter?

Here’s what my painted chromosome 15 looks like.

Here, I’ve drawn boxes around the areas that are triangulated. Actually, I made a small mistake and omitted one grey bar that’s also part of a second triangulation group. Can you spot it? Hint – look at the grey bars at far right in the overlapping triangulation group boxes where the red arrow is pointing. The box below should extend upwards to incorporate part of that top grey bar too.

Triangulated segments are those several segments piled up on top of each other. It means they match you at the same address on either the maternal or paternal chromosome. That’s good, but it’s not the same as an official “pileup area.”

Ok, so what’s a pileup area?

Pileup Areas

Certain locations in the human genome have been designated as pileup regions based on the fact that many people will match on these segments, not necessarily because they share a common relatively recent ancestor, but instead because a particular segment has a very high frequency in the general human population, or in the population of a specific region. Translated, this means that the segment might not be relevant to genealogy.

But before going too far with this discussion, it doesn’t mean that matches in pileup regions aren’t relevant to genealogy – just consider it a caution sign.

Aside from chromosome 6, which includes the HLA region, I’ve always been rather suspicious of pileup regions, because they don’t seem to hold true for me. You can view a chart that I assembled of the known pileup regions here.

DNA Painter generously includes pileup region warnings, in essence, along a chromosome bar at the top indicating “shared” or “both.”

Please note that you can click to enlarge any image.

Pileup regions are indicated by the grey hashed region at right. In my case, on chromosome 1, the pileup region isn’t piled up at all, on either the paternal (blue) chromosome or the maternal (pink) chromosome.

As you can see, I have exactly one match on the maternal side (green) and one (gold) on the paternal side (with a smidgen of a second grey match) as well, with both extending significantly beyond the pileup region. There is no reason to suspect that these gold and green matches aren’t valid.

If I saw many more matches in a pileup region than elsewhere, or many small matches, or DNA that was supposed to be from multiple ancestors not in the same line, then I’d have to question whether a pileup region was responsible.
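If you want to apply that caution sign to your own segment list, the check is simple to sketch in Python. The function name and the positions in the comments are hypothetical. Substitute the real start and end of whatever pileup region you are checking against.

```python
def classify_against_pileup(seg_start, seg_end, region_start, region_end):
    """Describe where a matching segment sits relative to a pileup region.

    All positions are on the same chromosome and in the same units.
    """
    if seg_end <= region_start or seg_start >= region_end:
        return "outside"          # no overlap with the pileup region at all
    if seg_start >= region_start and seg_end <= region_end:
        return "inside"           # entirely within the region: be cautious
    return "extends beyond"       # overlaps but continues past the region


def count_inside(segments, region_start, region_end):
    """Count (start, end) segments that fall entirely inside the region."""
    return sum(
        1 for start, end in segments
        if classify_against_pileup(start, end, region_start, region_end) == "inside"
    )
```

A segment that comes back as "extends beyond", like my gold and green matches, is much less worrisome than a cluster of small segments that all come back as "inside".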

Stacked Segments

DNA Painter provides you with the opportunity to see which of your ancestors’ segments stack. Stacking is a very important concept of DNA painting.

Before we talk about stacking, notice that the legend for which segments are color coded to specific ancestors is located at right. You can also click on the little grey box beside “Shared or Both,” at left, to show the match names beside the segments.  This is very useful when trying to analyze the accuracy of the match.

I wish DNA Painter offered an option to paint the ancestors’ names beside the segments. Maybe in V2. It’s really difficult to complain about anything because this tool is both free and awesome.

I’m using PowerPoint to label this group of stacked matches for this example.

This is a situation where I know my pedigree chart really well, so I know immediately upon looking at this stacked segment group who this piece of DNA descends from.

Here’s my pedigree chart that corresponds to the stacked segment.

We attribute each DNA segment to a couple initially based on who we match. In this case, that’s William George Estes and Ollie Bolton, my grandparents. The DNA remains attributed to them until we have evidence of which individual person in the couple received that DNA from their ancestors and passed it on to their descendant.

Therefore, the pink people are the half of the couple who we now know (thanks to DNA Painter) did NOT contribute that DNA segment, because we can track the DNA directly through the yellow line until we’re once again to another genetic brick wall couple.

My father is listed at left, and the DNA path runs back to William Crumley the second and his unknown wife who is haplogroup H2a1, the yellow couple at far right. How cool is this? One of those ancestors (or a combined segment from both) has been passed intact to me today. This is not a trivial segment either at 23.3 cM. I would not expect a segment passed to 5th cousins to be that large, but it is!

Also, note that the grey segment of DNA from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) is sitting slightly to the left of the dark blue segment from William Crumley III, so part or all of the grey or blue segment may originate with a different ancestor. Perhaps we’ll know more when additional people test and match on this same segment.

Double Related

I have one person who is related to me through two different lines. I need a way to determine which line (or both) our common DNA segment descends from.

I painted the segment for both of our common ancestor couples. The pink is George Dodson (1702-1770) & Margaret Dagord. The bright blue segment is William Crumley III (1788-1859) & Lydia Brown.

Those two lines don’t converge, at least not that we know of.

Now, as I map additional people, I’ll watch this segment for a tie breaker match between the two ancestors. The gold is not a tie breaker because that’s my grandparents who are downstream of both the pink and blue ancestors.

Painted Ethnicity

23andMe does us the favor of painting our ethnicity segments and allowing us to download a file with those segments. Conversely, DNA Painter does us the favor of allowing us to paint that entire file at once.

I already know my two Native segments on chromosomes 1 and 2 descend through my mother, because her DNA is Native in exactly the same location. In other words, in this case, my ethnicity segment does in fact phase to my mother, although that’s not always the case with ethnicity.

Multiple Acadian ancestors are also proven to be Native by both genealogical records and maternal and/or paternal haplogroups.

Therefore, I’ve painted my Native segments on my mother’s side in order to determine exactly from which ancestor(s) those Native segments descend.

Confirming Questionable Ancestors

One very long-standing mystery that seemed almost unsolvable was the identity of the parents of Elijah Vannoy (1784->1850). We know he was the son of one of 4 Vannoy brothers living in Wilkes County, NC. Two were eliminated by existing Bibles and other records, but the other two remained candidates in spite of sifting through every available record and resource. We were out of luck unless DNA came to the rescue. Y DNA confirmed that Elijah was descended from one of the Vannoy males, but didn’t shed light on which one.

I decided that the wives would be the key, since we knew the identity of all four wives, thankfully. Of course, that means we’d be using autosomal DNA to attempt to gather more information.

I entered one candidate couple at Ancestry as Elijah’s parents – the one I felt most likely based on tax records and other criteria – Daniel Vannoy and Sarah Hickerson.  I also entered Sarah’s parents, Charles Hickerson (c 1725-<1793) and Mary Lytle.

I began getting matches to people who descend from Charles Hickerson and Mary Lytle through children other than Sarah.

The grey segment is from a descendant of Lazarus Estes & Elizabeth Vannoy. The salmon segments are from descendants of Charles Hickerson and Mary Lytle.

These segments aren’t small, 12.8 and 16.1 cM, so I’m fairly confident that these multiple segments in combination with the Elizabeth Vannoy segment do indeed descend from Charles Hickerson and Mary Lytle.

At Ancestry, I have 5 matches to Charles Hickerson and Mary Lytle through three of their children. However, only two of the individuals have transferred their results to either Family Tree DNA, MyHeritage or GedMatch where segment information is available to customers.

Finally, the thirty-year-old mystery is solved!

Shifting, Sliding, Offset or Staggered Segment Groups

Occasionally, you can prove an entire large segment by groups of shifting or sliding segments, sometimes referred to as offset or staggered segments.

The entire bright pink region is inherited from Jacob Lentz (1783-1870) and Fredericka Reuhl (1788-1863.) However, it’s not proven by one individual but by a combination of 6 people whose segments don’t all overlap with each other.  The top two do match very closely with me and each other, then the third spans the two groups. The bottom 3 and part of the middle segment match very closely as well.

I can conclude that the entire dark pink region from left to right descends from Jacob and Fredericka.
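
For readers who like to see the logic spelled out, here is a minimal Python sketch of the staggered-segment idea. It is only an illustration and not how DNA Painter works internally: the positions are made-up numbers, and merge_segments is a hypothetical helper. Segments attributed to the same couple are merged whenever they overlap, and if a single unbroken span comes out, the staggered matches together cover the whole region.

```python
def merge_segments(segments):
    """Merge overlapping (start, end) segments into continuous spans."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # This segment overlaps the previous span, so extend that span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # No overlap with anything so far, so start a new span.
            merged.append((start, end))
    return merged


# Hypothetical start/end positions (in Mb) for six matches to one couple.
staggered_matches = [(10, 25), (11, 26), (22, 40), (35, 52), (36, 53), (37, 52)]
spans = merge_segments(staggered_matches)
print(spans)
```

One span in the result means the six segments chain together without a gap. Overlap by itself is not proof, of course. The overlapping matches still need to match each other on the shared portion.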

Two Matches – 7 Generations

Two matches is all it took to identify this segment back to George Dodson and Margaret Dagord.

The mustard match is to my grandparents (22cM), and the pink match is to George Dodson (1702-1770) and his wife (22cM) – 7 generations. These people also match each other.

Additional matches would make this evidence stronger, although a 22cM triangulated match is very significant alone. Future matches might also suggest ancestors further back in time.

First Chromosome Fully Mapped

I actually have chromosome 5 entirely mapped to confirmed ancestors. I’m so excited.

Uh Oh – Something’s Wrong

I found a stack that clearly indicates something is wrong.  The question is, what?

The mustard represents my paternal grandparents, so these segments could have come through either of them, although on the pedigree chart below, we can see that this came through my grandfather’s line.

There is only a small overlap with the magenta (Nicholas Speak 1782-1852 and Sarah Faires 1786-1865) and green (James Crumley 1711-1764 and Catherine c1712-c1790,) which could be by chance given that the Nicholas segment is 7.5 cM, so I’m leaving the magenta out of the analysis.

However, the rest of these segments overlap each other significantly, even though they are stepped or staggered.

As you can see from the colors on the pedigree chart, it’s impossible for the green segment to descend from the same ancestor as the purple segment. The purple and orange confirm that branch of the tree, but the red cannot be from the same ancestor or the same line as the green ancestor.

I suspect that the purple and orange line is correct, because there are 4 segments from different people with the same ancestral line.

This means that we have one of the following situations with the red and green segments:

  • The smaller segments are incorrect, false positives, meaning matching by chance. The green segment is 14 cM, so quite large to match by chance. The red segment is 10 cM. Possible, but not probable.
  • The segments are population-based matches, so appear in all 3 lines. Possible, technically, but also not probable due to the segment size.
  • The segments are genuine matches, and one of the lines is also found in one of the other lines, upstream. This is possible, but this would have to be the case with both the red and green lines. To continue to weigh this possibility, I’ll be watching for similar situations with these same ancestors.
  • Some combination of the above.

I need more matches on this segment for further clarity.

Visual Phasing – Crossovers

A crossover point is where the DNA on one side of a demarcation line is descended from one ancestor and the DNA on the other side is descended from another ancestor, represented by the pink and blue halves of the segment, below.

Crossovers occur when the DNA is combined from two different ancestors when it is passed to the child. In other words, a chunk of mom’s ancestors’ DNA is contributed by mom and a chunk of dad’s ancestors’ DNA is contributed as well. The seam between different ancestors’ DNA pieces is called a crossover.

In this example, the brown lines, confirmed by several testers to be from Henry Bolton (c1759-1846) and Nancy Mann (c1780-1841), are shown with a very specific left starting point, all in a vertical line. It looks for all the world like this is a crossover point. The DNA to the left would have been contributed by another, as yet unidentified, ancestor.

The gold lines above are matches from more recent generations.

Naming Those Unnamed Acadians

My Acadian ancestry is hopelessly intertwined, but chromosome painting may in fact provide me with some prayer of unraveling this ball of twine. Eventually.

When I know that someone is Acadian, but I can’t tell which of many lines I connect through, I add them as “Acadian Undetermined.”

There’s a lot of Acadian DNA, because it’s an endogamous population and they just keep passing the same segments around and around in a very limited population.

On my maternal chromosome, all of the olive green is “Acadian Undetermined.”  However, that blue segment in the stack is Rene de Forest (1670-1751) and Francoise Dugas (1678->1751).

In essence, this one match identified all of the DNA of the other people who are now simply a row in the Acadian Undetermined stack. Now I need to go back and peruse the trees of these individuals to determine if they descend from this line, or a common ancestor of this line, or if (some of) these matches are a matter of endogamy.

Endogamous matches can be population based, meaning that you do match each other, but it’s because you share so much of the same DNA because you have small pieces of many common ancestors – not because a particular segment comes from one specific ancestor. You can also share part of your DNA from Mom’s side and part from Dad’s side, because both of your parents descend from a common population and not because the entire segment comes from any particular ancestor.

On some long cold winter weekend, I’ll go through and map all of the trees of my Acadian matches to see what I can unravel. I just love matches with trees. You just can’t do something like this otherwise.

Of course, those Acadians (and other endogamous populations) can be tricky, no matter what, one click up from a needle in a haystack.

Acadian Endogamy Haystack on Steroids

At first, our haystack looks like we’ve solved the mystery of the identity of the stack.  However, we soon discover that maybe things aren’t as neat and tidy as we think.

Of course, the olive green is Acadian Undetermined, but the three other colored segments are:

  • Pink – Guillaume Blanchard (1650-1715/17) & Huguette Goujon (c1647-1717)
  • Brown/Pink – Francois Broussard (c1653-1716) & Catherine Richard (c1663-1748)
  • Coffee – Daniel Garceau (1707-1772) & Anne Doucet (1713-1791)

Looking at the pedigree chart, we find two of these couples in the same lineage, so all is good, until we find the third, pink, couple, at the bottom.

Clearly, this segment can’t be in two different lines at once, so we have a problem.  Or do we?

Working the pink troublesome lines on back, we make a discovery.

We find a Blanchard line consisting of Guillaume Blanchard born circa 1590 and Huguette Poirier also born circa 1590.

Interesting. Let’s compare the Guillaume Blanchard and Huguette Goujon line. Is this the same couple, but with a different surname for her?

No, as it turns out, the Guillaume Blanchard who married Huguette Goujon was the grandson of Guillaume Blanchard and Huguette Poirier. That haystack segment of DNA was passed down through two different lines, it appears, to converge in three descendants – me, the descendant of the pink segment couple and the descendant of the brown/burgundy segment couple. This segment reaches back in time to the birth of either Guillaume Blanchard or Huguette Poirier in 1590, someplace in France, rode over on the ship to Port Royal in the very early 1600s, probably before Jamestown was settled, and has been kicking around in my ancestors and their descendants ever since.

This 18 or so cM ancestral segment is buried someplace at Port Royal, Nova Scotia, but lives on in me and several other people through at least two divergent lines.

The X Chromosome

Several vendors don’t report the X chromosome segments. I do use X segments from those who do, but I utilize a different threshold because the SNP density is about half of that on the other chromosomes. In essence, you need a match twice as large to be equivalent to a match on another chromosome.

Generally, I don’t rely on segments below 10 cM for anyone, and I generally only use segments over 14 cM with no fewer than 500 SNPs.
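
Here is a rough Python sketch of how a rule of thumb like this could be written down. The function name, the three labels and the way the 10 cM floor is combined with the 14 cM and 500 SNP preferences are one reading of the guideline above, made for illustration. They are not a feature of any vendor or of DNA Painter.

```python
def classify_x_segment(cm, snps, floor_cm=10.0, preferred_cm=14.0, min_snps=500):
    """Label an X segment using illustrative thresholds.

    Assumed reading of the rule of thumb: anything under the floor or with
    too few SNPs is skipped, anything between the floor and the preferred
    size is treated as speculative, and the rest is considered usable.
    """
    if cm < floor_cm or snps < min_snps:
        return "skip"
    if cm < preferred_cm:
        return "speculative"
    return "use"


# Made-up SNP counts paired with segment sizes for illustration.
for cm, snps in [(48.0, 6000), (12.0, 800), (7.5, 900), (20.0, 300)]:
    print(cm, snps, classify_x_segment(cm, snps))
```

The thresholds are parameters so they can be tightened or loosened to suit your own comfort level.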

Having just said that, I have painted a few smaller segments, because I know that if they are inaccurate, they are very easy to delete. They can remain in speculative mode, which is the default for DNA Painter and what I use.

The great thing about the X chromosome is that because of its special inheritance path, you can sometimes push these segments another 2 generations back in time.

Let’s use an X chromosome match in conjunction with my X fan chart printed through Charting Companion.

On the paternal X, I inherited the gold segment from the couple, William George Estes (1873-1971) & Ollie Bolton (1874-1955.) However, since my father didn’t inherit an X from William George Estes (because my father inherited the Y from his father,) that X segment has to be from Ollie Bolton, and therefore from her parents Joseph Bolton (1853-1920) and Margaret Claxton (1851-1920.)

The segment from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) that’s 14 cM is false. It can’t descend from that couple. Same for the 7.5 cM from Jotham Brown (c1740-c1799) & Phoebe unk (c1747-c1803.) That segment’s false too. The green 48 cM segment from Samuel Claxton (1827-1876) and Elizabeth Speak (1832-1907)?  That segment’s good to go!

On my mother’s side, there’s a 7.8 cM Acadian Undetermined, which must be false, because Curtis Benjamin Lore (1856-1909) did not inherit an X chromosome from his Acadian father, Antoine Lore (1805-1862/67.)  Therefore, my X chromosome has no Acadian at all. I never realized that before, and it makes my X chromosome MUCH easier.

How about that light green 33cM segment from Antoine Lore (1805-1862/67) & Rachel Hill (1814/15-1870/80)? That segment must come from Rachel Hill, so it’s pushed back another generation to Joseph Hill (1790-1871) and Nabby Hall (1792-1874.)

I love the X chromosome because when you find a male in the line, you automatically get bumped two more generations back to his mother’s parents. It’s like the X prize for genetic genealogy, pardon the pun!
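
The X inheritance rule is simple enough to sketch in Python. This is an illustration only, and x_ancestors is a hypothetical helper. It assumes standard ahnentafel numbering, where person n has father 2n and mother 2n + 1, so every even-numbered ancestor is male and every odd-numbered ancestor above 1 is female. A male receives his X only from his mother, while a female receives an X from each parent.

```python
def x_ancestors(root_is_male, generations):
    """Return ahnentafel numbers of ancestors who could have contributed X DNA."""
    current = [1]  # 1 is the tester
    found = []
    for _ in range(generations):
        next_gen = []
        for person in current:
            is_male = root_is_male if person == 1 else person % 2 == 0
            if not is_male:
                # A daughter receives an X from her father.
                next_gen.append(2 * person)
            # Everyone receives an X from their mother.
            next_gen.append(2 * person + 1)
        found.extend(next_gen)
        current = next_gen
    return sorted(found)


# A female tester, three generations back.
print(x_ancestors(root_is_male=False, generations=3))
```

For a female tester, the paternal grandfather (number 4) never appears in the list, while the paternal grandmother (number 5) and her parents (numbers 10 and 11) do. That is the same reasoning that moves the gold segment from William George Estes and Ollie Bolton to Ollie alone, and then to her parents.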

Adoptees

Some adoptees are lucky and receive close matches immediately. Others, not so much and the search is a long process.

If you’re an adoptee trying to figure out how your matches connect together, use in-common-match groupings to cluster matches together, then paint them in groups.  Utilize the overlapping segments in order to view their trees, looking for common surnames. Always start with the groups with the longest segments and the most matches. The larger the match, the more likely you are to be able to find a connection in a more recent generation. The more matches, the more likely you are to be able to spot a common surname (or two.)
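
As a rough illustration of the grouping step, here is a small Python sketch. The names and cM values are invented, cluster_matches is a hypothetical helper, and real clustering tools use more refined methods. It simply puts two matches in the same group when they are linked, directly or through other matches, by an in-common-with relationship, then sorts the groups so the one with the longest segment (and then the most members) comes first.

```python
from collections import defaultdict


def cluster_matches(shared_pairs):
    """Group matches that are connected through in-common-with pairs."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Walk everything reachable from this match.
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(group))
    return clusters


# Invented in-common-with pairs and longest shared segment (cM) per match.
shared = [("Ann", "Bob"), ("Bob", "Cal"), ("Dee", "Eve")]
longest_cm = {"Ann": 45.0, "Bob": 30.2, "Cal": 18.9, "Dee": 62.0, "Eve": 12.5}

clusters = cluster_matches(shared)
clusters.sort(key=lambda g: (max(longest_cm[m] for m in g), len(g)), reverse=True)
print(clusters)
```

Each resulting group is a candidate to paint together and to search for common surnames.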

Painting can speed this process significantly.

Much More Than Painting

I hope this tour through my colorful chromosomes has illustrated how much fun analysis can be. You’ll have so much fun that you won’t even realize you’re triangulating, phasing and all of those other difficult words.

If you have something you absolutely have to do, set an alarm – or you’ll forget all about it. Voice of experience here!

So, go and find some segments to paint so all of these exciting things can happen to you too!

How far back will you be able to identify a segment to a specific ancestor?  How about a triangulated segment? An X segment?

Have fun!!! Don’t forget to eat!

PS – If you’d like to learn more about Phasing, Triangulation or hear my keynote speech, consider signing up for the Virtual DNA Conference June 21-24. I’ll be presenting on both of those topics. You can sign in anytime for the next year to listen to the sessions, not just during the conference days. The keynote will be recorded and available afterwards as well.

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate.  If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase.  Clicking through the link does not affect the price you pay.  This affiliate relationship helps to keep this publication, with more than 900 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc.  In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received.  In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product.  I only recommend products that I use myself and bring value to the genetic genealogy community.  If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA, or one of the affiliate links below:

Affiliate links are limited to:

Milestone! 1000 Articles About Genetic Genealogy

Today is a big day for DNA-eXplained. I christened this blog on July 11, 2012 with an invitation for the world of genetic genealogy to follow along. Wow, what a ride!

Today, about 5 weeks shy of the blog’s 6th birthday, I’m publishing my 1000th article – this one. I don’t even want to know how many words or pages, but I do know I’ve gone through two keyboards – worn the letters right off the keys.

My original goal in 2012 was to publish one article per week. That would have been 307 articles this week. I’ve averaged 3.25 articles a week. That’s almost an article every other day, which even surprises me!

That’s wonderful news for my readers because it means that there is so much potential in the genetic genealogy world that I need to write often. Even so, I always feel like there is so much to say – so much that needs to be taught and that I’ll never catch up.

I wonder, which have been the most popular articles?

Most Popular Articles

The most popular article has received almost a million views.

I’m not surprised that the article about Native American heritage and DNA testing is number one. Many people want to verify their family stories of Native American ancestry. It was and remains a very large motivation for DNA testing.

One link I expected to see on this list, but didn’t, is my Help page. Maybe because it’s a page and not an article? Maybe I should publish it as an article too. Hmmm….

What Do These Articles Have In Common?

Four are about ethnicity, which doesn’t surprise me. In the past couple of years, one of the major testing companies has pushed ethnicity testing as a “shortcut” to genealogy. That’s both a blessing and a curse.

Unfortunately, it encourages a misperception of DNA testing and what it can reasonably do, causing dissatisfaction and kit abandonment. Fortunately, advertising encourages people to test and some will go on to get hooked, upload trees and engage.

The good news is that judging from the popular articles, at least some people are researching ethnicity testing – although I have to wonder if it’s before or after they receive their test results.😊

Three articles are specifically about Native American heritage, although I suspect people who discover that they don’t carry as much Native as they expected are also reading ethnicity articles.

Two articles are specifically not about autosomal results, which pleases me because many autosomal testers don’t know about Y and mitochondrial DNA, or if they do, they don’t understand what it can do for them or how to utilize results.

Several articles fall into the research category – meaning an article someone might read to decide what tests to purchase or how to understand results.

Key Word Searchable

One of the things I love about WordPress, my blogging platform, is that DNA-eXplained is fully keyword searchable. This means that you can enter any term you want to find in the search box in the upper right-hand corner and you’ll be presented with a list of articles to select from.

For example, if you enter the phrase “Big Y,” you’ll find every article, beginning with the most recent, that has those words in the title, the text or as a tag or category.

Go ahead, give it a try. What would you like to learn about?

More Tools – Tags and Categories

Tags and categories help you find relevant information and help search engines find relevant articles when you “Google” for something.

If you scroll down the right-hand sidebar of the blog, you’ll see, in order:

  • Subscription Information
  • Family Tree DNA ad
  • Award Received
  • Recent Posts
  • Archives by date
  • Categories
  • Tags
  • Top Posts and Pages

Bloggers categorize their articles, so if you want to view the articles I’ve categorized as “Acadians” or “Art,” for example, just click on that link.

I use Tags as a more general article categorization. Tags are displayed in alphabetical order with the largest font indicating the tags with the most tagged articles.

You can see that I categorize a lot of articles as Basic Education and General Information. You can click on any tag to read those articles.

My Biggest Surprise

I’ve been asked what’s the most surprising thing that I’ve learned.

I very nearly didn’t publish my 52 Ancestors series because I didn’t think people would be interested in my own family stories about my ancestors and the search that uncovered their history.

Was I ever wrong. Those stories, especially the research techniques, including DNA of course, have been extremely well received. I’ve learned that people love stories.

Thank you for the encouragement. This next week will be the 197th article in that series.

I encourage everyone to find a way to tell the story of your ancestors too. If you don’t, who will?

My Biggest Disappointment

I think my biggest disappointment has been that not enough people utilize the information readily available on the blog. By this, I mean that I see questions on Facebook in multiple groups every day that I’ve already written about and answered – sometimes multiple times in different ways.

This is where you can help. If you see questions like that, please feel free to share the love and post links to any articles. With roughly 12 million testers today and more before year end – there are going to be lots of questions.

Let’s make sure they receive accurate answers.

Sharing

Please feel free to share and post links to any of my articles. That’s the purpose. You don’t need to ask permission.

If you would like to reproduce an article for any reason, please contact me directly.

Most of all, read, enjoy and learn. Encourage others to do so as well. The blog is free for everyone, but any support you choose to give by way of purchasing through affiliate links is greatly appreciated. It doesn’t cost you more, but a few cents comes my way from each purchase through an affiliate link to help support the blog.

What’s Coming?

I have a few articles in process, but I’d like to know what you’d like to see.

Do you have suggestions? Please leave them in the comments.

I’d love to hear from you and I often write articles inspired by questions I receive.

Subscribe

Don’t miss any articles. If you haven’t already, you can subscribe by entering your e-mail just above the Follow button on the upper right-hand side of the right sidebar.

You can also subscribe via an RSS feed, or follow me on Twitter. You can follow DNAexplain on Facebook, but be aware that Facebook doesn’t show you all of the postings, and you won’t want to miss anything. Subscribing via e-mail is the most reliable option.

Thank You

There’s so much available today – it’s a wonderful time to be a genealogist that’s using DNA. There used to be a difference between a genealogist and a genetic genealogist – but I think we’ve moved past that stage and every genealogist should be utilizing all aspects of DNA (Y, mitochondrial, autosomal and X) as tools.

Thank you for subscribing, following or however you read these articles. You’re an amazing audience. I’ve made the unexpected wonderful discovery that many of you are my cousins as well.

Thanks to you, I’ve unraveled mysteries I never thought would be solved. I’ve visited ancestral homelands as a result of your comments and assistance. I’ve met amazing people. Yes, that means YOU!

I’m extremely grateful. I started this blog to help other people, never imagining how much it would help me too.

I love writing for you, my extended family.

Enjoy and Happy Ancestor Hunting!

Concepts: Anonymized Versus Pseudonymized Data and Your Genetic Privacy

Until recently, when people (often relatives) expressed concerns about DNA testing, genetic genealogy buffs would explain that the tester could remain anonymous, and that their test could be registered under another name; ours, for example.

This means, of course, that since our relative is testing for OUR genealogy addiction, er…hobby, that we would take care of those pesky inquiries and everything else. Not only would they not be bothered, but their identity would never be known to anyone other than us.

Let’s dissect that statement, because in some cases, it’s still partially true – but in other cases, anonymity in DNA testing is no longer possible.

You certainly CAN put your name on someone else’s kit and manage their account for them. There are a variety of ways to accomplish this, depending on the testing vendor you select.

If the DNA testing is either Y or mitochondrial DNA, it’s extremely UNLIKELY, if not impossible, that their Y or mitochondrial DNA is going to uniquely identify them as an individual.

Y and mitochondrial DNA are extremely useful in identifying someone as having descended from an ancestor, or not, but they (probably) won’t reveal the tester’s identity to any matching person – at least not without additional information.

If you need a brush-up on the different kinds of DNA and how they can be used for genealogy, please read 4 Kinds of DNA for Genetic Genealogy.

Y and mitochondrial DNA can be used to rule in or rule out specific descendant relationships. In other words, you can tell for sure that you are NOT related through a specific line. Conversely, you can sometimes confirm that you are most likely related to someone you match through the direct Y (patrilineal) line for males, and matrilineal mitochondrial line for both males and females. That match could be very distant in time, meaning many generations – even hundreds or thousands of years ago.

However, autosomal DNA, which tests a subset of all of your DNA for the genealogical goal of matching to cousins and confirming ancestors is another matter entirely. Some of the information you discern from autosomal testing includes how closely you match, which effectively predicts a range of relationships to your match.

These matches are much more recent in time and do not reach back into the distant past. The more closely you are related, the more DNA you share, which means that your DNA is identifying your location in the family tree, regardless of the name you put on the test itself.
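
To put rough numbers on "the more closely you are related, the more DNA you share," here is a small Python sketch. It shows the average autosomal sharing expected for full cousins, where each extra generation halves the expected share. The helper name is hypothetical, and actual sharing between any two cousins varies widely around these averages because recombination is random.

```python
def expected_shared_fraction(cousin_degree, times_removed=0):
    """Average fraction of autosomal DNA shared by full cousins.

    Full first cousins average 1/8, and each additional cousin degree
    quarters that figure, while each "removed" halves it.
    """
    return 0.5 ** (2 * cousin_degree + 1 + times_removed)


for degree in (1, 2, 3):
    print(degree, f"{expected_shared_fraction(degree):.2%}")
```

The steep drop is why a close match narrows the possible relationships so sharply, whatever name is on the kit.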

Now, let’s look at the difference between anonymization and pseudonymization.

It may seem trivial, but it isn’t.

Anonymization vs Pseudonymization

Recently, as a result of the European Union GDPR (General Data Protection Regulation,) we’ve heard a lot about privacy and pseudonymization, which is not the same as anonymized data.

Anonymized data must be entirely stripped of any identifiable information, making it impossible to derive insights on a discrete individual, even by the person or entity who performed the anonymization. In other words, anonymization cannot be reversed under any circumstances.

Given that the purpose of genetic genealogy conflicts with the concept of anonymization, the term pseudonymization is more properly applied to the situation where someone masks or replaces the name of the tester with the goal of hiding the identity of the person who is actually taking the test.

Pseudonymization under GDPR (Article 4(5)) is defined as “the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of ‘additional information.’”

In reality, pseudonymization is what has been occurring all along, because the tester could always be re-identified by you.
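
The difference is easy to see in a toy Python example. This is a sketch for illustration only. The pseudonymize helper and the "kit-" labels are hypothetical and do not reflect any vendor's system. The name is replaced with a random label, but the person who did the replacing keeps a key table, which is exactly the "additional information" that allows re-identification.

```python
import secrets


def pseudonymize(names):
    """Replace each name with a random label and keep a key table."""
    key_table = {}
    labels = []
    for name in names:
        label = "kit-" + secrets.token_hex(4)  # random, carries no name information
        key_table[label] = name
        labels.append(label)
    return labels, key_table


labels, key_table = pseudonymize(["Uncle Henry"])
print(labels[0])             # what other people see on the match list
print(key_table[labels[0]])  # what the kit manager can always look up
```

True anonymization would require that no such key table exist anywhere, for anyone.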

However, and this is important, neither anonymization nor pseudonymization can be guaranteed to disguise your identity anymore.

Anonymous Isn’t Anonymous Anymore

The situation with autosomal DNA and the expectation of anonymity has changed rather gradually over the past few years, but with tidal wave force recently with the coming-of-age of two related techniques:

  • The increasingly routine identification of biological parents
  • The Buckskin Girl and Golden State Killer cases in which a victim and suspect were identified in April 2018, respectively, by the same methodology used to identify biological parents

Therefore, with autosomal DNA results, meaning the raw data results file ONLY, neither total anonymity nor any expectation of pseudonymization is reasonable or possible.

Why?

The reason is very simple.

The size of the databases of the combined mainstream vendors has reached the point where it’s unusual, at least for US testers, to not have a reasonably close match with a relative that you did not personally test – meaning third cousin or closer. Using a variety of tools, including in-common-with matches and trees, it’s possible to discern or narrow down candidates to be either a biological parent, a crime victim or a suspect.

In essence, the only real difference between genetic genealogy searching, parent searches and victim/suspect searches is motivation. The underlying technique is exactly the same with only a few details that differ based on the goal.

You can read about the process used to identify the Golden State Killer here, and just a few days later, a second case, the Cook/Van Cuylenborg double homicide cold case in Snohomish County, Washington was solved utilizing the following family tree of the suspect whose DNA was utilized and matched the blue and pink cousins.

Provided by the Snohomish County Sheriff

A genealogist discovering those same matches, of course, would be focused on the common ancestors, not contemporary people or generations.

To identify present day individuals, meaning parents, victims or suspects, the researcher identifies the common ancestor and works their way forward in time. The genealogist, on the other hand, is focused on working backwards in time.

All three types of processes, genealogical, parent identification and law enforcement, depend on identifying cousins that lead us to common ancestors.

At that point, the only question is whether we continue working backwards (genealogically) or begin working forwards in time from the common ancestors for either parent identification or law enforcement.
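
For readers who think in code, the "work forward from the common ancestor" step can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the method any investigator or vendor actually uses: the pedigree, every name in it, and the helper functions `descendants_by_generation` and `candidates` are all hypothetical. It assumes the unknown person matches a known tester at roughly the second cousin level, so both sit three generations below the shared ancestor.

```python
from collections import deque

# Hypothetical toy pedigree: each person maps to a list of their children.
# Every name here is invented purely for illustration.
CHILDREN = {
    "Common Ancestor": ["Child A", "Child B"],
    "Child A": ["Grandchild A1"],
    "Child B": ["Grandchild B1", "Grandchild B2"],
    "Grandchild A1": ["Known Match"],
    "Grandchild B1": ["Candidate 1", "Candidate 2"],
    "Grandchild B2": ["Candidate 3"],
}


def descendants_by_generation(ancestor, children):
    """Walk forward in time from an ancestor, breadth-first.

    Returns a dict mapping each descendant to the number of
    generations separating them from the ancestor.
    """
    depth = {}
    queue = deque([(ancestor, 0)])
    while queue:
        person, generation = queue.popleft()
        for child in children.get(person, []):
            if child not in depth:
                depth[child] = generation + 1
                queue.append((child, generation + 1))
    return depth


def candidates(ancestor, children, generations, exclude=()):
    """People exactly `generations` below the ancestor, minus known testers."""
    depth = descendants_by_generation(ancestor, children)
    return sorted(
        person
        for person, generation in depth.items()
        if generation == generations and person not in exclude
    )


# Second cousins share great-grandparents, so look three generations down.
print(candidates("Common Ancestor", CHILDREN, 3, exclude={"Known Match"}))
```

In this toy tree the list narrows to three people. In real work the remaining candidates would be narrowed further using the amount of shared DNA, ages, locations and targeted testing, none of which is modeled here.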

Even when the suspect’s or victim’s name or identifying information is not known, their DNA, in combination with the DNA of their matches, can identify them uniquely (unless they are an identical twin), or closely enough that targeted testing or non-genetic information will confirm the identification.

Sometimes, people newly testing discover that a parent, sibling or half sibling genetic match is just waiting for them and absolutely no analysis is necessary. You can read about the discovery of the identity of my brother’s biological family here and here.

Therefore, we cannot represent to Uncle Henry, especially when discussing autosomal DNA testing, that he can test and remain anonymous. He can’t. If there is a family secret, known or unknown to Uncle Henry, it’s likely to be exposed utilizing autosomal DNA and may be exposed utilizing either Y or mitochondrial DNA testing.

For the genealogist, this may cause Pavlovian drooling, but Uncle Henry may not be nearly so enthralled.

In Summary

Genealogical methods developed to identify currently living individuals have made the concept of genetic anonymity obsolete. You can see in the pedigree chart example below how the same match, in yellow, can lead to solving any of the three different scenarios we’ve discussed.

If the tester is Uncle Henry, you might discover that his parents weren’t his parents. You also might discover who his real parents were, when your intention was only to confirm your common great-grandparents. So much for that idea.

A match between Henry and a second cousin, in our example above, can also identify someone involved in a law enforcement situation, although today those are very few and far between. Testing for law enforcement purposes is prohibited according to the terms and conditions of all 4 major testing vendors: Ancestry, 23andMe, Family Tree DNA and MyHeritage.

Currently, law enforcement kits to identify either victims or suspects can be uploaded at GedMatch, but only for violent crimes identified as either homicide or sexual assault, per their terms and conditions.

Furthermore, both 23andMe and Ancestry, who previously reserved the right to anonymize your genetic information and sell or otherwise utilize that information in aggregated format, can no longer do so under the new GDPR legislation without your specific consent. GDPR, while a huge pain in the behind for other reasons, has returned the control of the consumer’s DNA to the consumer in these cases.

The loss of anonymity is the inevitable result of this industry maturing. That’s good news for genetic genealogy. It means we now have lots of matches – sometimes more than we can keep up with!

Because of those matches, we know that if we test our DNA, or that of a family member, our DNA plus the common DNA shared with many of our relatives is enough to identify us, or them. That’s not news to genealogists, but it might be to Uncle Henry, so don’t tell him anymore that he can be anonymous.

You can pseudonymize accounts to some extent by masking Uncle Henry’s name or using your name. Managing accounts for the same reasons of convenience that you always did is just fine! We just need to explain the current privacy situation to Uncle Henry when asking permission to test or to upload his raw data file to GedMatch (or anyplace else), because ultimately, Uncle Henry’s DNA leads to Uncle Henry, no matter whose name is on the account.

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 900 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and that bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA, or one of the affiliate links below:

Affiliate links are limited to: