Imputation Matching Comparison

In a future article, I’ll be writing about the process of uploading files to DNA.Land and the user experience, but in this article, I want to discuss only one topic, and that’s the results of imputation as it affects matching for genetic genealogy. DNA.Land is one of three companies known positively to be using imputation (DNA.Land, MyHeritage and LivingDNA), and one of two that allow transfers and do matching for genealogy.

This is the second in a series of three articles about imputation.

Imputation, discussed in the article, Concepts – Imputation, is the process whereby your tested DNA is “expanded” by inferring results you don’t have, meaning locations that haven’t been tested, by using information from results you do have. Vendors have no choice in this matter, as Illumina, the maker of the DNA chip widely utilized in the genetic genealogy marketspace, has obsoleted the prior chip and moved to a new chip with only about 20% overlap in the locations previously tested. Imputation is the methodology utilized to attempt to bridge the gap between the two chips for genetic genealogy matching and ethnicity predictions.

Imputation is built upon two premises:

1 – that DNA locations are inherited together

2 – that people from common populations share a significant amount of the same DNA

An example of imputation that DNA.Land provides is the following sentence.

I saw a blue ca_ on your head.

There are several letters that are more likely than others to be found in the blank, and some words would be more likely to be found in this sentence than others.

A less intuitive sentence might be:

I saw a blue ca_ yesterday.

DNA.Land doesn’t perform DNA testing, but instead takes a file that you upload from a testing vendor that has around 700,000 locations and imputes another 38.3 million variants, or locations, based on what other people carry in neighboring locations. These numbers are found in the SNPedia instructions for uploading DNA.Land information to their system for usage with Promethease.
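To make the idea concrete, here is a toy sketch in Python of filling a blank from what a reference group carries at the neighboring positions. It is only an illustration of the general concept, assuming a simple majority vote among compatible reference sequences. It is not DNA.Land's actual algorithm, and the function name and the small reference panels are made up for the example.

```python
from collections import Counter
from typing import List

def impute_missing(sample: str, reference_panel: List[str]) -> str:
    """Fill each '_' in `sample` with the most common value found at that
    position among reference sequences that agree at every tested position."""
    tested = [(i, ch) for i, ch in enumerate(sample) if ch != "_"]
    compatible = [
        ref for ref in reference_panel
        if len(ref) == len(sample) and all(ref[i] == ch for i, ch in tested)
    ]
    result = []
    for i, ch in enumerate(sample):
        if ch != "_" or not compatible:
            # Keep tested values; leave the gap if no reference is compatible.
            result.append(ch)
        else:
            votes = Counter(ref[i] for ref in compatible)
            result.append(votes.most_common(1)[0][0])
    return "".join(result)

# Made-up "populations", in the spirit of the blue ca_ sentence.
word_panel = ["cap", "cap", "cat", "car", "cup"]
dna_panel = ["ACGT", "ACGT", "ACTT", "AGGT"]

word_guess = impute_missing("ca_", word_panel)
dna_guess = impute_missing("AC_T", dna_panel)
print(word_guess, dna_guess)
```

Notice that the guess is only as good as the reference panel. If the people in the panel don't resemble you at that location, the majority vote can confidently fill in the wrong value.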

I originally wrote about Promethease here, and I’ll be publishing an updated article shortly.

In this article, I want to see how imputation affects matching between people for genetic genealogy purposes.

Genetic Genealogy Matching

In order to be able to do an apples to apples comparison, I uploaded my Family Tree DNA autosomal file to DNA.Land.

DNA.Land then processed my file, imputed additional values, then showed me my matches to other people who have also uploaded and had additional locations imputed.

DNA.Land has just over 60,000 uploads in their database today. Of those, I match 11 at a high confidence level and one at a speculative level.

My best match, meaning my closest match, Karen, just happened to have used her GedMatch kit number for her middle name. Smart lady!

Karen’s GedMatch number provided me with the opportunity to compare our actual match information at DNA.Land, then also at GedMatch, then compare the two different match results in order to see how much of our matching was “real” from portions of our tested kits that actually match, and what portion of our DNA matches as a result of the DNA.Land imputation.

At DNA.Land, your match information is presented with the following information:

  • Relationship degree – meaning estimated relationship
  • # shared segments – although many of these are extremely small
  • Total shared cM
  • Total recent shared length in cM
  • Longest recent shared segment in cM
  • Relationship likelihood graph
  • Shared segments plotted on chromosome display
  • Shared segments in a table

Please note that you can click on any graphic to enlarge.

DNA.Land provides what they believe to be an accurate estimate of recent and anciently shared DNA segments.

The match table is a dropdown underneath the chromosome graphic at far right:

For this experiment, I copied the information from the match table and dropped it into a spreadsheet.

DNALand Match Locations

My match information is shown at DNA.Land with Karen as follows:

Matching segments are identified by DNA.Land as either recent or ancient, which I find to be over-simplified at best and misleading or inaccurate at worst. I guess it depends on how you perceive recent and ancient. I think they are trying to convey the concept that larger segments tend to be more recent, and smaller segments tend to be older, but ancient in the genetics field often refers to DNA extracted from exhumed burials from thousands of years ago. Furthermore, smaller segments can be descended from the same ancestor as larger segments.

GedMatch Match

Since Karen so kindly provided her GedMatch kit number, I signed in to GedMatch and did a one-to-one match with this same kit.

Since all of the segments are 3 cM and over at DNA.Land, I utilized a GedMatch threshold of 3 cM and dropped the SNP count to 100, since a SNP count of 300 gave me few matches. For this comparison, I wanted to see all my matches to Karen, no matter how few SNPs are involved, in an attempt to obtain results similar to DNA.Land. I normally would not drop either of these thresholds this low. My typical minimum is 5cM and 500 SNPs, and even if I drop to 3cM, I still maintain the 500 SNP threshold.
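For readers who like to see thresholds as a filter, here is a minimal Python sketch of what lowering or raising the cM and SNP minimums does to a list of segments. The segment values are hypothetical placeholders, not my actual results with Karen, and the names are invented for the example. GedMatch's own matching logic is not shown here.

```python
from typing import List, NamedTuple

class Segment(NamedTuple):
    chromosome: int
    centimorgans: float
    snps: int

def apply_thresholds(segments: List[Segment], min_cm: float, min_snps: int) -> List[Segment]:
    """Keep only segments that meet both the cM and the SNP minimums."""
    return [s for s in segments if s.centimorgans >= min_cm and s.snps >= min_snps]

# Hypothetical placeholder segments, NOT real match data.
segments = [
    Segment(chromosome=8, centimorgans=31.5, snps=4000),
    Segment(chromosome=3, centimorgans=3.4, snps=320),
    Segment(chromosome=11, centimorgans=3.1, snps=150),
]

counts = {}
for min_cm, min_snps in [(3, 100), (3, 300), (5, 500)]:
    kept = apply_thresholds(segments, min_cm, min_snps)
    counts[(min_cm, min_snps)] = len(kept)
    print(f"{min_cm} cM / {min_snps} SNPs: {len(kept)} segment(s)")
```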

Let’s see how the data from GedMatch and DNA.Land compares.

In my spreadsheet, below, I pasted the segment match information from DNA.Land in the first 5 columns with a red header. Note that DNA.Land does not provide the number of shared SNPs.

At right, I pasted the match information from GedMatch, with a green header. We know that GedMatch has a history of accurately comparing segments, and we can do a cross platform comparison. I originally uploaded my FTDNA file to DNA.Land and Karen uploaded an Ancestry file. Those are the two files I compared at GedMatch, because the same actual matching locations are being compared at both vendors, DNA.Land (in addition to imputed regions) and GedMatch.

I then copied the matching segments from GedMatch (3cM, 100 SNPs threshold) and placed them in the middle columns in the same row where they matched corresponding DNA.Land segments. If any portion of the two vendors’ segments overlapped, I copied them as a match, although two are small and partial and one is almost negligible. As you can see, there are only 10 segments with any overlap at all in the center section. Please note that I am NOT suggesting these are valid or real matches. At this point, it’s only a math/match exercise, not an analysis.
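I did this alignment by hand in a spreadsheet, but the "any overlap counts" rule can be sketched in Python. This is an illustration only, assuming each segment is a (chromosome, start, end) tuple of positions. The segment values below are made-up placeholders rather than the actual DNA.Land or GedMatch data, and the function name is hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int, int]  # (chromosome, start position, end position)

def overlapping_pairs(a: List[Segment], b: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Pair every segment in `a` with every segment in `b` that is on the
    same chromosome and shares at least one position."""
    pairs = []
    for seg_a in a:
        for seg_b in b:
            same_chromosome = seg_a[0] == seg_b[0]
            # Two ranges overlap when each one starts before the other ends.
            if same_chromosome and seg_a[1] <= seg_b[2] and seg_b[1] <= seg_a[2]:
                pairs.append((seg_a, seg_b))
    return pairs

# Hypothetical placeholder data, NOT the real segments from either vendor.
dnaland = [(8, 1_000_000, 9_000_000), (8, 9_500_000, 15_000_000), (3, 2_000_000, 4_000_000)]
gedmatch = [(8, 900_000, 15_200_000), (5, 1_000_000, 3_000_000)]

matched = overlapping_pairs(dnaland, gedmatch)
unmatched = [seg for seg in dnaland if all(seg != pair[0] for pair in matched)]
print(len(matched), "overlapping;", len(unmatched), "with no counterpart")
```

In this made-up data, one long segment from the second list overlaps two shorter pieces from the first, which is the same pattern as a segment that one vendor reports whole and another reports split.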

The match comparison column (yellow header) is where I commented on the match itself. In some cases, the lack of the number of SNPs at DNA.Land was detrimental to understanding which vendor was a higher match. Therefore, when possible, I marked the higher vendor in the Match Comparison column with the color of their corresponding header.

Analysis

Frankly, I was shocked at the lack of matching between GedMatch and DNA.Land. Trying to understand the discrepancy, I decided to look at the matches between Karen, who has been very helpful, and me at other vendors.

I then looked at our matches at Ancestry, 23andMe, MyHeritage and at Family Tree DNA.

The best comparison would be at Family Tree DNA where Karen loaded her Ancestry file.  Therefore, I’m comparing apples to apples, meaning equivalent to the comparison at GedMatch and DNA.Land (before imputation).

It’s impossible to tell much without a chromosome browser at Ancestry, especially after Timber processing which reduces matching DNA.

DNA.Land categorized my match to Karen as “high certainty.” My match with Karen appears to be a valid match based on the longest segment(s) of approximately 30cM on chromosome 8.

  • Of the 4 segments that DNA.Land identifies as “recent” matches, 2 are not reflected at all in the GedMatch or Family Tree DNA matching, suggesting that these regions were imputed entirely, and incorrectly.
  • Of the 4 segments that DNA.Land identifies as “recent” matches, the 2 on chromosome 8 are actually one segment that imputation apparently divided. According to DNA.Land, imputation can increase the number of matching segments. I don’t think it should break existing segments, meaning segments actually tested, into multiple pieces. In any event, the two vendors do agree on this match, even though DNA.Land breaks the matching segment into two pieces where GedMatch and Family Tree DNA do not. I’m presuming (I hate that word) that this is the one segment that Ancestry calls as a match as well, because it’s the longest, but Ancestry’s Timber algorithm downgrades the match portion of that segment by removing 11cM (according to DNA.Land) from 29cM to 18cM, or by removing 13cM (according to both GedMatch and Family Tree DNA) from 31cM to 18cM. Both GedMatch and Family Tree DNA agree and appear to be accurate at 31cM.
  • Of the total 39 matching segments of any size, utilizing the 3cM threshold and 100 SNPs, which I set artificially very low, GedMatch only found 10 matching segments with any portion of the segment in common, meaning that at least 29 were entirely erroneous matches.
  • Resetting the GedMatch match threshold to 3 cM and 300 SNPs, a more reasonable SNP threshold for 3cM, GedMatch only reports 3 matching segments, one of which is chromosome 8 (undivided, so it covers both DNA.Land pieces), which means that at this threshold, 35 of the 39 matching DNA.Land segments are entirely erroneous. Setting the threshold to a more reasonable 5cM or 7cM and 500 SNPs would result in only the one match on chromosome 8.

  • If 29 of 39 segments (at 3cM 100 SNPs) are erroneously reported, that equates to 74.36% erroneous matches due to imputation alone, without considering identical by chance (IBC) matches.
  • If 35 of 39 segments (at 3cM 300 SNPs) are erroneously reported, that equates to 89.74% erroneous matches, again without considering those that might be IBC.
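The two percentages above are simple division, shown here as a short Python check. The counts (29, 35 and 39) come straight from the comparison; the function name is just a label for this sketch.

```python
def percent_erroneous(erroneous: int, total: int) -> float:
    """Share of segments with no counterpart, as a percentage rounded to two decimals."""
    return round(100 * erroneous / total, 2)

at_100_snps = percent_erroneous(29, 39)  # 3 cM / 100 SNP comparison
at_300_snps = percent_erroneous(35, 39)  # 3 cM / 300 SNP comparison
print(at_100_snps, at_300_snps)
```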

Predicted vs Actual

One additional piece of information that I gathered during this process is the predicted relationship.

Vendor | Total cM | Total Segments | Longest Segment (cM) | Predicted Relationship
DNA.Land | 162 to 3 cM | 39 to 3 cM | 17.3 & 12, split | 3C
GedMatch | 123 to 3 cM | 27 to 3 cM | 31.5 | 5.1 gen distant
Family Tree DNA | 40 to 1 cM | 12 to 1 cM | 32 | 3-5C
MyHeritage | No match | No match | No match | No match
Ancestry | 18.1 | 1 | 18.1 | 5-8C
23andMe | 26 | 1 | 26 | 3-6C

Karen utilized her Ancestry file and I used my Family Tree DNA file for all of the above matching except at 23andMe and Ancestry, where we are both tested on the vendors’ platforms. Neither 23andMe nor Ancestry accepts uploads. I included the 23andMe and Ancestry comparisons as additional reference points.

The lack of a match at MyHeritage, another company that implements imputation, is quite interesting. Karen and I, even with a significantly sized segment, are not shown as a match at MyHeritage.

If imputation actually breaks some matching segments apart, like the chromosome 8 segment at DNA.Land, it’s possible that the resulting smaller individual segments simply didn’t exceed the MyHeritage matching threshold. It would appear that the MyHeritage matching threshold is probably 9cM, given that my smallest segment match of all my matches at MyHeritage is 9cM. Therefore, a 31 or 32 cM segment would have to be broken into 4 roughly equally sized pieces (32/4=8) for the match to Karen not to be detected because all segment pieces are under 9cM. MyHeritage has experienced unreliable matching since their rollout in mid 2016, so their issue may or may not be imputation related.
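The "how many pieces" arithmetic can be checked with a few lines of Python. This sketch assumes the segment is split into equal pieces and assumes the 9cM threshold I inferred above, which MyHeritage has not confirmed. The function name is made up for the example.

```python
import math

def min_equal_pieces_below(segment_cm: float, threshold_cm: float) -> int:
    """Smallest number of equal pieces such that every piece is shorter
    than the matching threshold."""
    # We need segment_cm / n < threshold_cm, which means n > segment_cm / threshold_cm.
    return math.floor(segment_cm / threshold_cm) + 1

pieces_for_32 = min_equal_pieces_below(32, 9)  # 32 / 4 = 8 cM per piece
pieces_for_31 = min_equal_pieces_below(31, 9)  # 31 / 4 = 7.75 cM per piece
print(pieces_for_32, pieces_for_31)
```

Three pieces would not be enough in either case, because 32/3 and 31/3 are both over 10cM, so at least one piece would still clear a 9cM threshold.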

The Common Ancestor

At Family Tree DNA, Karen does not match my mother, so I can tell positively that she is related through my father’s line. She and I triangulate on our common segment with three other individuals who descend from Abraham Estes (1647-1720).

Utilizing the chromosome browser, we do indeed match on chromosome 8 on a long segment, which is also our only match over 5cM at Family Tree DNA.

Based on our trees as well as the trees of our three triangulated Estes matches, Karen and I are most probably either 8th cousins, or 8th cousins once removed, assuming that is our only common line. I am 8th cousins with the other three triangulated matches on chromosome 8. Karen’s line has yet to be proven.

Imputation Matching Summary

I like the way that DNA.Land presents some of their features, but as for matching accuracy, you can view the match quality in various ways:

  1. DNA.Land did find the large match on chromosome 8. Of course, in terms of matching, that’s pretty difficult to miss at roughly 30cM, although MyHeritage managed. Imputation did split the large match into two, somehow, even though Karen and I match on that same segment as one segment at other vendors comparing the same files.
  2. Of the 39 DNA.Land total matches, other than the chromosome 8 match, two other matches are partial matches, according to GedMatch. Both are under 7cM.
  3. Of DNA.Land’s total 39 matches, 35 are entirely wrong, in addition to the two that are split, including two inaccurate imputed matches at over 5cM.
  4. At DNA.Land, I’m not so concerned about discerning between “real” and “false” small segment matches, as compared to both FTDNA and GedMatch, as I am about incorrectly imputed segments and matches. Whether small matches in general are false positives or legitimate can be debated, each smaller segment match based on its own merits. Truthfully, with larger segments to deal with, I tend to ignore smaller segments anyway, at least initially. However, imputation adds another layer of uncertainty on top of actual matching, especially, it appears, with smaller matches. Imputing entire segments of incorrect DNA concerns me.
  5. Having said that, I find it very concerning that MyHeritage, which also utilizes imputation, missed a significant match of over 30cM. I don’t know of a match of this size that has ever been proven to be a false match (through parental phasing), and in this case, we know which ancestor this segment descends from through independent verification utilizing multiple other matches. MyHeritage should have found that match, regardless of imputation, because that match is from portions of the two files that were both tested, not imputed.

Summary

To date, I’m not impressed with imputation matching relative to genetic genealogy at either DNA.Land or MyHeritage.

In one case, that of DNA.Land, imputation shows matches for segments that are not shown as matches at either Family Tree DNA or GedMatch who are comparing the same two testers’ files, but without imputation. Since DNA.Land did find the larger segment, and many of their smaller segments are simply wrong, I would suggest that perhaps they should only show larger segments. Of course, anyone who finds DNA.Land is probably an experienced genetic genealogist and probably already has files at both GedMatch and Family Tree DNA, so hopefully savvy enough to realize there are issues with DNA.Land’s matching.

In the second imputation case, that of MyHeritage, the match with Karen is missed entirely, although that may not be a function of imputation. It’s hard to determine. MyHeritage is also comparing the same two files uploaded by Karen and me to the other vendors who found that match, both vendors who do and don’t utilize imputation.

Regardless of imputing additional locations, MyHeritage should have found the matching segment on chromosome 8 because that region does NOT need to be imputed. Their failure to do so may be a function of their matching routine and not of imputation itself. At this point, it’s impossible to discern the cause. We only know, based on matching at other vendors, that the non-match at MyHeritage is inaccurate.

Here’s what DNA.Land has to say about the imputed VCF file, which holds all of your imputed values, when you download the file. They pull no punches about imputation.

“Noisey and probabilistic.” Yes, I’d say they are right, and problematic as well, at least for genetic genealogists.

Extrapolating this even further, I find it more than a little frightening that my imputed data at DNA.Land will be utilized for medical research.

Quoting now from Promethease, a medical reference site that allows the consumer to upload their raw data files, providing consumers with a list of SNPs having either positive or negative research in academic literature:

DNA.land will take a person’s data as produced by such companies and impute additional variants based on population frequency statistics. To put this in concrete terms, a person uploading a typical 23andMe file of ~700,000 variants to DNA.land will get back an (imputed) file of ~39 million variants, all predicted to be present in the person. Promethease reports from such imputed files typically contain about 50% more information (i.e. 50% more genotypes) than the corresponding reports from raw (non-imputed) data.

Translated, this means that the report from your imputed data contains about one and a half times as much “genetic information” as the report from your actual tested data. The question remains, of course, how much of this imputed data is accurate.
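The figures in the Promethease quote can be put side by side with a little arithmetic. This Python sketch uses only the approximate numbers quoted above (about 700,000 tested variants, about 39 million in the imputed file, about 50% more genotypes in the report); the variable names are mine.

```python
tested_variants = 700_000           # typical uploaded file, per the quote
imputed_file_variants = 39_000_000  # size of the imputed file, per the quote

# How many times larger the imputed file is than the tested file.
variant_ratio = imputed_file_variants / tested_variants

# "About 50% more" genotypes in the report means 1.5 times the raw-data report.
report_ratio = 1 + 0.50

print(f"Variants in the file: about {variant_ratio:.1f} times the tested count")
print(f"Genotypes in the report: about {report_ratio:.1f} times the raw-data report")
```

In other words, the file itself grows by a factor of more than 50, while the Promethease report built from it grows by about half.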

That will be the topic of the third imputation article. Stay tuned.

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 850 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA.

Family Tree DNA Sale, MyHeritage Transfers and Hurricane Fundraiser

As many of you know, the owners of Family Tree DNA, a Houston company, have committed a percentage of their sales during the month of September for donation to Hurricane Harvey disaster relief efforts. A daily running total is displayed at the top of their page.

I think they will top $20,000 today!

I know that with two more hurricanes (Irma and Maria) and two earthquakes in Mexico, Harvey, which ravaged Texas less than 3 weeks ago, seems like old news. It’s not to the families whose lives have been upended and who have lost everything, not only due to the winds of the hurricane along the coast, but unprecedented flooding in Houston for the following week. Those families are still cleaning mud out of their homes, ripping off their sheetrock, and so much more. Thousands are displaced and have lost everything.

The best part about the Family Tree DNA fundraiser is that you can contribute to the relief effort without any additional cost to you. In fact, there’s a lot of benefit to everyone – you benefit when you order a test or upgrade, other people whose genealogy may depend on your testing benefit, and the families trying to recover from Harvey benefit as well. You never know, maybe the person you desperately need to knock down a brick wall will test or transfer now!

Everyone wins! But you only have another week, so don’t wait.

Family Tree DNA just sweetened the deal in three ways too.

Deal Sweeteners

MyHeritage Transfer

Family Tree DNA has just added MyHeritage as a transfer partner, meaning if you tested at MyHeritage, you can transfer your results to Family Tree DNA and see matches for free.

The autosomal DNA transfer option for MyHeritage as well as other vendors can be found, here, in the upper left hand corner of the main Family Tree DNA page, under DNA Tests.

Family Tree DNA accepts transfers from:

  • Ancestry
  • 23andMe V3 and V4
  • MyHeritage

Family Finder Just $69

The Family Finder autosomal test is on sale now for $69, a $20 savings. If you haven’t tested yet, or have transferred the 23andMe V4 or Ancestry V2 tests which only provide your closest matches, and not the more distant ones (due to chip incompatibility), now is a great time to order a Family Finder test. I don’t know how long the sale price lasts, so if you’re interested, buy now.

Unlock All Transfer Features Just $10

In addition, Family Tree DNA has dropped the price of unlocking the full suite of autosomal tools available after the free transfer of your results. You receive your matches for free, but by adding the $10 unlock, on sale reduced from the regular $19 until the end of September, you add three features:

  • Chromosome Browser
  • myOrigins (ethnicity)
  • ancientOrigins.

You will need a coupon code, so you can use mine. These codes are NOT limited to one use only, so please feel free to upgrade as many tests as you wish.

USE CODE: ATUL0917

Here’s what the unlock gives you access to, in addition to your free matches.

Transferring and Unlocking is Easy…

  • Click here to upgrade, unlock (ATUL0917) or transfer your results from another vendor.
  • Then sign on to your own account to transfer, unlock or upgrade if you already have an account at Family Tree DNA.
  • If you don’t currently have an account at Family Tree DNA, click in the upper left hand corner of the page you’ll see to set up an account and transfer your DNA file from another vendor. Then use the code (ATUL0917) to unlock all the features for just $10.

It’s that easy and you’ll be helping others too!

______________________________________________________________________

Glossary – DNA – Deoxyribonucleic Acid

What is DNA and why do I care?

Good questions. Let’s take a look at the answer in general, then why we use DNA for genealogy.

The Recipe for You

DNA, deoxyribonucleic acid, is the book of life for all organisms. In essence, it’s the recipe for you – and what makes you unique.

DNA is formed of strands that twist to form the familiar double helix pattern.

The two strands are joined together by one of 4 different nucleotides, one extending from each side to connect in the middle. The nucleotides are:

  • Cytosine – C
  • Guanine – G
  • Thymine – T
  • Adenine – A

The nucleotide names don’t really matter for genetic genealogy, but what does matter is that the sequence of these nucleotides when chained together is what encodes information on long structures called chromosomes. Each person carries 22 pairs of chromosomes, plus the 23rd chromosome pair which is gender specific.

Using DNA for Genetic Genealogy

There are four different kinds of DNA that genealogists use in different ways for obtaining ancestors’ information relevant to genetic genealogy. Thankfully, we have 4 different kinds of DNA available to us because of unique inheritance patterns for each kind of DNA – meaning we inherited different kinds of DNA from different ancestral paths. If one kind of DNA doesn’t work in a particular situation, chances are good that another type will.

Genetic genealogy makes use of 4 different types of DNA.

  • Y DNA – passed from males to male children, only (your father’s paternal line)
  • Mitochondrial DNA – passed from females to both genders of children, but only females pass it on (your mother’s matrilineal line)

Y and mitochondrial DNA inheritance paths are shown on a pedigree chart in the graphic below, with the blue boxes representing Y DNA and the red circles representing mitochondrial DNA inheritance.

In addition to Y and mitochondrial DNA, genetic genealogists also use two kinds of DNA that reflect inheritance from additional ancestral lines, in addition to the red and blue lines shown above – meaning the ancestral lines with no color.

  • Autosomal DNA – the 22 chromosomes that recombine during reproduction.
  • X Chromosome – always contributed by the mother, but only contributed by the father to female children – this is the 23rd chromosome pair which recombines with a unique inheritance pattern.  You can read more about that in the article, X Marks the Spot.

Receiving What Kind of DNA from Whom

While the Y and mitochondrial DNA have unique and very prescribed inheritance patterns as shown by the red arrows pointing to the blue Y chromosome below at far left, and the red mitochondrial circles at far right, the 22 autosomal chromosomes are contributed equally by each parent. In other words, for each chromosome, a child inherits half of each parent’s DNA. How the DNA that is contributed to each child is selected is unknown.

A child’s gender is determined by the parent’s contributions to the 23rd chromosome, not shown above. The following chart explains gender determination by the X and Y combinations of the 23rd chromosome.

Child | Received from Mother | Received from Father
Male child | X | Y
Female child | X | X

The Y chromosome is what makes males male.

No Y chromosome?  You’re a female.

However, this X chromosome inheritance pattern provides us with the ability to look at X matches for males and know immediately that they had to have come from his mother’s lineage – because males don’t inherit an X chromosome from their father.
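The chart and the X matching rule can be written as two tiny Python functions. This is just a restatement of the typical inheritance pattern described above, with function names invented for the example.

```python
from typing import List

def child_sex(from_father: str) -> str:
    """The mother always contributes an X, so the father's contribution decides."""
    if from_father not in ("X", "Y"):
        raise ValueError("the father contributes either X or Y")
    return "male" if from_father == "Y" else "female"

def x_sources(sex: str) -> List[str]:
    """Which parents contributed an X chromosome to a child of this sex."""
    return ["mother"] if sex == "male" else ["mother", "father"]

son = child_sex("Y")
daughter = child_sex("X")
print(son, x_sources(son))
print(daughter, x_sources(daughter))
```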

Autosomal DNA and Genetic Genealogy

The 22 non-gender chromosomes recombine in each generation, with half of each chromosome being contributed by each parent, as shown in the illustrations above.

You can see that in the first generation, the child received one blue and one yellow, or one pink and one green, chromosome. In giving each child exactly half of their DNA, each parent contributes some amount of ancestral DNA from generations upstream, as you can see in the mother/father and son/daughter generations.

For example, each child receives, on average, 25% of each of their grandparents’ DNA – although they can receive somewhat more or less than 25%, depending on the random nature of recombination.
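The averages follow a simple halving pattern, sketched here in Python. These are expected averages only; as noted above, the actual amount inherited from any ancestor beyond a parent varies because recombination is random. The function name is made up for the example.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor who is
    `generations_back` generations ago (1 = parent, 2 = grandparent)."""
    return 0.5 ** generations_back

for generations, label in [(1, "parent"), (2, "grandparent"), (3, "great-grandparent")]:
    print(f"{label}: {expected_share(generations):.1%}")
```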

Therefore, genetic genealogy testing companies compare each tester’s autosomal DNA with that of other testers and look for common segments contributed by common ancestors, resulting in autosomal matching.

When relatively large segments match between three or more relatives who are not immediate family, we can attribute that DNA to a common ancestor. Of course, the challenge, and the thrill, is to determine which common ancestor contributed that common DNA to our triangulated match group. It’s a great way to verify our research and to break down brick walls.

Let’s face it, you received ALL of your DNA from SOME combination of ancestors, and if you carry large enough pieces from any specific ancestor, we can, hopefully, identify the source of that DNA segment by looking at the genealogy of those we match on that segment.

It’s a great puzzle to unravel, and best of all, it’s the puzzle of you.

More Info

The great news is that you can utilize your Y DNA, mitochondrial DNA and autosomal DNA differently, to provide you with different kinds of information about different ancestors and genealogy lines.

If you’d like to read more about how the 4 Kinds of DNA can be used, please read the short article, 4 Kinds of DNA for Genetic Genealogy.

You can also enter any word or phrase into the search box in the upper right hand corner of this blog to find additional useful information about any topic.

If You Want to Test

If you’d like to learn more about the various kinds of DNA tests available, and which one or ones would be the best for you, please read the article, Which DNA Test is Best?

Right now, the Y DNA, mitochondrial and autosomal (Family Finder) tests are on sale at Family Tree DNA, through the end of August, 2017.

______________________________________________________________________

Using Spousal Surnames and DNA to Unravel Male Lines

When Y DNA matching at Family Tree DNA, it’s not uncommon for men to match other males of the same surname who share the same ancestor. In fact, that’s what we hope for, fervently!

However, if you’re stuck downstream, you may need to figure out which of several male children you descend from.

If you’re staring at a brick wall working your way back in time, you may need to try working forward, utilizing various types of information, including wives’ surnames.

For all intents and purposes, this is my Vannoy line, in Wilkes County, NC, so let’s use it as an example, because it embodies both the promise and the peril of this approach.

So, there you sit, disconnected from the Vannoy line. That little yellow box is just so depressing. So close, but yet so far. And yes, we’ve already exhausted the available paper trail records, years ago.

We know the lineage back through Elijah Vannoy, who was born between 1784-1786 in Wilkes County, or vicinity. We know my Vannoy cousin Y DNA matches with other men from the Vannoy line upstream of John Francis Vannoy, the known father of four sons in Wilkes County, NC and the first (and only) Vannoy to move from New Jersey to that part of North Carolina.

Therefore, we know who the candidates are to be Elijah’s father, but the connection in the yellow box is missing. Many Wilkes County records have gone missing over the years and births were not recorded in that timeframe.  The records from neighboring Ashe County where Daniel Vannoy lived burned during the Civil War, although some records did survive. In other words, the records are rather like Swiss cheese. Welcome to genealogy in the south.

Which of John Francis Vannoy’s four sons does Elijah descend from?

Let’s see what we can discover.

Contact Matches and Ask for Help

The first thing I would do is to ask for assistance from your surname matches.

Let’s say that you match a known descendant of each of these four men, meaning each of John Francis Vannoy’s sons. Ask each person if they know where the male Vannoy descendants of each son went along with any documentation they might have. If your ancestor, Elijah in this case, is not found in the same location as the sons, geography may be your friend.

In our case, we know that Francis Vannoy migrated to Knox County, Kentucky, but that was after he signed for his daughter’s marriage in Wilkes Co., NC in 1812. It was also about this time that Elijah Vannoy migrated to Claiborne County, TN, in the same direction, but not the same location. The two locations are an hour away by car today, separated by mountains and the Cumberland Gap, a nontrivial barrier.

We also know that Nathaniel Vannoy left a Bible that did not list Elijah as one of his children, but with a gap large enough to possibly encompass another child. If you’re thinking to yourself, “Who would leave a child’s birth out of the Bible?” I thought the same thing until I encountered it myself in another line. However, the Bible record does make Nathaniel a less likely father candidate, despite a persistent rumor that Nathaniel was Elijah’s father.

Our only other clues are some tax records recording the number of children in the household of various ages, but none are conclusive. None of these men had wills.

Y DNA Genetic Distance

Your Y DNA matches will show how many mutations you are from them at a particular marker level.

The number of mutations between two men is called the genetic distance.

The rule of thumb is that the more mutations, the further back in time the common ancestor. The problem is, the rule of thumb doesn’t always work. DNA mutates when it darned well pleases, not on any clock that we can measure with that degree of accuracy – at least not accurately enough to tell which of 4 sons a man descends from – unless that line has incurred a defining mutation between the ancestor and the current generation. We call those line marker mutations. To determine the mutation history, you need multiple men from each line to have tested.
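
To make the idea of genetic distance concrete, here is a minimal sketch in Python. It is a simplification and not Family Tree DNA’s actual calculation: it assumes every marker is a single number and counts each repeat difference as one step. The marker values below are invented purely for illustration.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat-count differences over markers both men tested.

    Simplified step-wise count; real vendor calculations treat some
    markers differently, so this is only an illustration.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


# Hypothetical marker values, for illustration only
tester = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
match_a = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11}
match_d = {"DYS393": 13, "DYS390": 22, "DYS19": 15, "DYS391": 11}

print(genetic_distance(tester, match_a))
print(genetic_distance(tester, match_d))
```

A smaller number suggests, but does not prove, a more recent common ancestor, for the reasons described above.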

You can read more about Y DNA matching in the article, Concepts – Y DNA Matching and Connecting with your Paternal Ancestor.

Check Autosomal DNA Tests

Next, check to see if your Y DNA matches from all Vannoy lines have also taken the autosomal Family Finder test, noted as FF, which shows matches from all ancestral lines, not just the paternal line.

You can see in the match list above that not many have taken the Family Finder test. Ask if they would be willing to upgrade. Be prepared to pay if need be – because you are, after all, the one with the “problem” to solve.

Generally, I simply offer to pay. It’s well worth it to me, and given that paper records don’t exist to answer the question – a DNA test under $100 is cheap. Right now, Family Finder tests are on sale for $69 until the end of the month.

Check for Intermarriage

While you’re waiting for autosomal DNA results, check the pedigrees for all four lines involved to see if you are otherwise related to these men or their wives.

For example, in Andrew Vannoy’s wife’s line and Elijah Vannoy’s wife’s line, we have a common ancestor. George Shepherd and Elizabeth Mary Angelique Daye are common to both lines, and John Shepherd’s wife is unknown, so we have one known problem and one unknown surname.

You can tell already that this could be messy, because we can’t really use Andrew Vannoy’s wife’s line to search for matches because Elijah’s line is likely to match through Andrew’s wife since Susannah Shepherd and Lois McNiel share a common lineage. Rats!

We’ll mark these in red to remind ourselves.

Check Advanced Matching

Family Tree DNA provides a wonderful tool that allows you to compare matches of different kinds of DNA. The Advanced Matching tab is found under “Tools and Apps” under the myFTDNA tab at the upper left.

In this case, I’m going to use the Advanced Match feature to see which of my Vannoy cousin’s Y matches at 37 markers, within the Vannoy DNA project, also match him autosomally.

This report is particularly nice, because it shows the number of Y mutations, often indicating distance to a common ancestor, as well as the estimated autosomal relationship range.

You can see in this case that the first Vannoy male, “A,” is a close match both on Y DNA and autosomally, with 1 mutation difference and falling in the 2nd to 4th cousin range, as compared to the second Vannoy male, “D,” who is 3 mutations different and falls into the 4th to remote cousin range.

Not every Vannoy male may have joined the Vannoy project, so you’ll want to run this report a second time, replacing the Vannoy project search criteria with “The Entire Database.”

Unfortunately, not everyone that I need has taken the Family Finder test, so I’ll be contacting a few men, asking if I can sponsor their upgrades.

Let’s move on to our next tactic, using the wives’ surnames.

Search Utilizing the Wife’s Surname

We already know that we can’t rely on the Shepherd surname, so we’ll have to utilize the surnames of the other three wives:

  • Millicent Henderson – parents Thomas Henderson born circa 1730 Virginia, died 1806 Laurens, SC, wife Frances, surname unknown
  • Elizabeth Ray (Raye) – parents William Ray born circa 1725/1730 Herdford, England, died 1783 Wilkes Co., NC (the portion now Ashe Co.), wife Elizabeth Gordon born circa 1783 Amherst Co., VA and died 1804 Surry Co., NC
  • Sarah Hickerson – parents Charles Hickerson born circa 1725 Stafford Co., VA, died before 1793 Wilkes Co., NC, wife Mary Lytle

Utilizing the Family Finder match search function, I’m going to search for matches that include the wives’ surnames, but are NOT descended from the Vannoy line.

Hickerson produced no non-Vannoy matches utilizing the matches of my first Vannoy cousin, but Henderson is another matter entirely.

Since the Henderson line would be on my cousin’s father’s side, the matches that are most relevant are the ones phased to his paternal line, those showing the blue person icon.

The surname that you have entered as the search criteria will show as blue in the Ancestral Surname list, at far right, and other matching surnames will show as black. Please note that this includes surnames from ANY person in the match’s tree if they have uploaded a Gedcom file, not just surnames of direct ancestral lines. Therefore, if the match has a tree, it’s important to click on the pedigree icon and search for the surname in question. Don’t assume.

Altogether, there are 76 Henderson matches, of which 17 are phased to his paternal line. You’ll need to review each one of at least the 17. Personally, I would painstakingly review each one of the 76. You never know where a shred of information will be found.

Please note, finding a match with a common surname DOES NOT MEAN THAT YOU MATCH THIS PERSON THROUGH THAT SURNAME. Even finding a person with a common ancestor doesn’t mean that you both descend from that ancestor. You may have a second common ancestor. It means that you have more work to do, as proof, but it’s the beginning you need.

Of course, the first thing we need to do is eliminate any matches who also descend from a Vannoy, because there is no way to know if the matching DNA is through the Vannoy or Henderson lines. However, first, take note of how that person descends from the Vannoy line.

You can see your match’s entire surname list by clicking on their profile picture.

The surname, Ray, is more difficult, because the search for Ray also returns names like Bray and Wray, as well as Ray.
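
If you keep your match list in a spreadsheet or export, the same filtering can be sketched in a few lines of Python. This is only an illustration under assumed data: the match records, field names and helper functions are hypothetical and not part of any Family Tree DNA tool. Comparing whole surnames, rather than searching for text inside them, avoids the Bray and Wray problem.

```python
def carries_surname(match, variants):
    """True if any whole surname in the match's list equals one of the variants."""
    names = {s.strip().lower() for s in match["surnames"]}
    return any(v.lower() in names for v in variants)


def candidate_matches(matches, wife_variants, exclude_variants):
    """Keep matches listing the wife's surname but not the excluded line."""
    return [
        m for m in matches
        if carries_surname(m, wife_variants)
        and not carries_surname(m, exclude_variants)
    ]


# Hypothetical match records, for illustration only
matches = [
    {"name": "Match 1", "surnames": ["Ray", "Smith"]},
    {"name": "Match 2", "surnames": ["Bray", "Jones"]},
    {"name": "Match 3", "surnames": ["Raye", "Vannoy"]},
    {"name": "Match 4", "surnames": ["Wray"]},
]

for m in candidate_matches(matches, ["Ray", "Raye"], ["Vannoy"]):
    print(m["name"])
```

A filter like this only narrows the list. Each remaining match still needs its tree reviewed by hand, and the excluded Vannoy descendants are worth noting separately so you know how they connect.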

But Wait – There’s a Happy Ending!

If you’re thinking, “this is a lot of work,” yes, it is.

Yes, you are absolutely going to do the genealogy of the wives’ lines so you can recognize if and how your matches might connect.

I enter the wives’ lines into my genealogy software and then I search for the ancestors found in my matches’ trees to see if they descend from that line.

One tip to make this easier is to test multiple people in the same line – regardless of whether they are males or carry the desired surname. They simply need to be descendants – that’s the beauty of autosomal DNA and why I carry kits with me wherever I go.  And yes, I’m really serious about that!

When you have multiple testers from the same line, you can utilize each test independently, searching for each surname in the Family Finder results. Then, from the surname match list, select a sibling or other close relative with that same surname in their list, then choose the ICW feature. This allows you to see who both of those people match and who also carry the Henderson surname in their surname list.

Not successful with that initial cousin’s match results – like I wasn’t with Hickerson?

Rinse and repeat, with every single person who you can find who has descended from the line in question. I started the process over again with a second cousin and a Hickerson search.

About the time you’re getting really, really tired of looking at all of those trees, extending the branches of other people’s lines, and are about to give up and go to bed because it’s 3 AM and you’re discouraged, you see something like this:

Yep, it’s good old Charles Hickerson and Mary Lytle.  I could hardly believe my eyes!!! This Hickerson match to a cousin in my Vannoy line descends from Charles Hickerson’s son, Joshua.

All of a sudden…it’s all worthwhile! Your fatigue is gone, replaced by adrenalin and you couldn’t sleep now if your life depended on it!

Using the ICW (in common with) feature to find additional known cousins who match the person with Charles Hickerson and Mary Lytle in their tree, I found a total of three Vannoy cousins with significant matches.

Using the chromosome browser to compare, I’ve confirmed that one segment is a triangulated match of 12.69 cM (blue) on chromosome 2.

You can read more about triangulation in the article, Concepts – Why Genetic Genealogy and Triangulation? as well as the article, Concepts – Match Groups and Triangulation.
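
The overlap check at the heart of triangulation can be sketched in Python. This is a minimal illustration, not a vendor tool: the segment positions are invented, and the sketch only tests whether three pairwise matching segments overlap on the same chromosome. It does not confirm that the matches are on the same parental side or that the common ancestor has been identified, both of which still require genealogy work.

```python
def overlap(seg1, seg2):
    """Return the overlapping (chromosome, start, end) of two segments, or None."""
    if seg1[0] != seg2[0]:
        return None
    start, end = max(seg1[1], seg2[1]), min(seg1[2], seg2[2])
    return (seg1[0], start, end) if start < end else None


def triangulated_region(ab, ac, bc):
    """Region shared by all three pairwise matches (A-B, A-C, B-C), or None."""
    first = overlap(ab, ac)
    return overlap(first, bc) if first else None


# Hypothetical segments as (chromosome, start position, end position)
ab = (2, 10_000_000, 25_000_000)
ac = (2, 12_000_000, 30_000_000)
bc = (2, 11_000_000, 24_000_000)

print(triangulated_region(ab, ac, bc))
```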

Do I wish I had more than three people in my triangulation group? Yes, of course, but with a match of this size triangulated between cousins and a Hickerson descendant who is a 30-year genealogist, sporting a relatively complete tree and no other common lines, it’s a great place to begin digging deeper! This isn’t the end, but a new beginning!

After obsessively digging through the matches of every Elijah Vannoy descended cousin I can find (sleep is overrated anyway) and whose account I have access to, I have now discovered matches with four additional people who have no other common lines with the Vannoy cousins and who descend from Charles Hickerson and Mary Lytle through sons David and Joseph Hickerson. I can’t tell if they triangulate without access to their accounts, which I don’t have, so I’ve sent e-mails requesting additional information.

WooHoo Happy Day!!! There’s a really big crack in the brick wall and I’ve just witnessed the sunrise of a beautiful, amazing day.

I think Elijah’s parents are…drum roll…Daniel Vannoy and Sarah Hickerson!

Which walls do you need to fall and how can you use this technique?

The Unexpected Bounty of DNA Testing – Friends and Family of Heart

Bill and Sandie Lakner, with me in the middle.

When I first started with genetic genealogy in the year 2000, I was interested in proving (or disproving) specific stories about my Estes ancestors as well as learning more about as many family lines as I could.

I hoped that I would meet new cousins that perhaps would have information that I don’t, and who would be willing to share.

What I never imagined, and I almost hate to admit this, is that I’d find a whole new group of friends.

I have always been a rather solitary researcher, in part because I don’t live anyplace near where my ancestors did. There are no records where I live for what I need to research, so the local genealogy societies hold little allure for me. In fact, in my state, I AM the immigrant, more or less. The “more or less” part of that comment will have to wait for another day and has to do with my father being stationed nearby in the military.

Several years ago, when autosomal DNA was added to the genetic genealogist’s menu, I began to hear from LOTS and LOTS of people. In fact, so many that one of the reasons I introduced my blog and began to write educational articles was as a form of self-defense. Between the blog and the projects I administer at Family Tree DNA, I found myself answering the same questions over and over again, so writing a nice article with graphics where I could refer people seemed like a great idea. Never did I imagine the blog would actually increase the amount of communications, but it did!

It’s hard for me to believe I’ve been doing this for 17 years now, almost half of my adult life. I’ve met people at conferences and many have become friends. There are people I’ve been fortunate to find that have my back when I need help or am in some kind of pickle. I know just who to refer to for what topic and I’ve been the beneficiary of MANY excellent researchers and kind souls. I’m grateful to and for every one.

Project administrators and those of us with specialty skills try to help everyone, but demand has been increasing like a tsunami. Now, that’s the good news, because an incredible number of people are testing, but it’s also the bad news because it necessitates brevity sometimes and a standard reply to many inquiries.

Somehow in the midst of this swirl, over the years, I have found new friends that stand apart from the rest and are truly near and dear to my heart. Some have specific interests that are similar to my own, but others, for some reason, have simply become close friends.

I’ve even adopted a new brother, John, not to be confused with my half-brother John. (Yes, I now have my brother John and my other brother John.)

It’s like we were all destined to meet and have been waiting for this moment all of our lives. Once we do finally meet, it’s like we’ve always known each other.

If you’re one of those people, you know who you are. You are my family of heart.

Family of heart becomes increasingly important as your family of blood becomes smaller and smaller and is geographically distant. In my case, exacerbating the situation, I moved away. I’m not alone though, because many other people are displaced too, becoming effectively an immigrant family of one in a new community someplace with no family nearby. Those people are much more likely, I think, to develop family of heart relationships.

E-mail, Facebook and other forms of communications have made distant friendships easier. It’s easier for family to keep current with each other as well.

Bill and Sandie Lakner

Enter Bill and Sandie Lakner, several years ago.

I would like to tell you that I remember the first communication from Sandie, but I don’t. I do know that what began as questions about DNA results years ago has evolved into shared genealogy hunts, finds, discussions about children, grandchildren, pets, movies, gardens and Hurricane Sandy – not to be confused with Sandie.

Our topics jump around like neighbors chatting over the fence.

We don’t “talk” daily, but often and usually electronically.  We keep in touch and have for years now, defying the odds of internet friendships and short attention spans. We check on each other when we know something difficult is happening in someone’s life or bad weather is bearing down.

Then, last week, I received an e-mail from Sandie telling me that she and Bill would be passing nearby while returning home from a visit to Minnesota in the next day or so.

Could they meet us for coffee?

Could they?

I was so excited and was hoping the schedule would allow more than coffee. As luck would have it, our time was limited, but we made the most of it.

The Quest

What fun we had!

We immediately began discussing Bill’s “secret quest,” or better stated, his quest to solve the family secret.

Bill was hoping his trip to Minnesota would yield information, and maybe, just maybe, a descendant of each of the male children of Joseph Lakner (1876-1926) who is willing to DNA test. Yes, we were discussing paternal ancestry and DNA.

More particularly, which of Joseph Lakner’s sons is Bill’s father?

By the way, if you are the child, either male or female, of one of Joseph Lakner’s male children and are willing to DNA test, please contact me (and I’ll put you in touch with Bill) or simply order a Family Finder test through this link at Family Tree DNA.

Social Faux Pas

Genetic genealogists sometimes forget that our topics aren’t entirely mainstream.

As we sat at our corner table in the local Big Boy, excitedly talking, I said to Bill, “You remember, that was my brother who wasn’t my brother…..”

About that time, the server who was entering orders into a computer turned around with a slack-jawed, rather incredulous, look on his face. I think he had to see just WHO was having this discussion, because…you know…“old people” don’t discuss those kinds of things. These kinds of “things” and resulting scandals were invented by the younger generation…said with tongue firmly in cheek.

The server was standing behind Bill, so Bill couldn’t see, but Sandie and I could. I fought laughter, immediately lowered my voice and attempted to do some amount of social recovery, but in the midst of the next sentence that had something to do with my father being married to both mothers at the same time, the server’s head came whipping around again, this time, with him staring over the top of his glasses to garner a better view.

I mean, who *are* these rowdy people anyway, and did they escape from the facility down the street? They are clearly demented. Should I call someone?

Sandie and I both saw this entire exchange and both began laughing uncontrollably, to the point that we couldn’t speak to explain. The look on Bill’s face only made it funnier, and then the server turned around once again and asked if we were laughing at his shock. Then he tried social recovery, but ran out of words and finally just muttered, “Hmmm….” and shook his head.

The entire exchange left everyone laughing to the point of tears. My poor husband was looking around, hoping no one recognized him.

It felt so good to be laughing together – friends who had been friends “forever” but had never met before.

Family of Heart

By the end of our very short hour or so, we were left wishing we were those neighbors who could visit over the fence. If we lived near each other, Sandie would know where everything in my kitchen is kept and vice versa and the guys would know how to start each other’s lawn mowers. Our kids would know each other, and our pets would greet each other like family. We had met our family of heart.

The field of genetic genealogy has truly blessed me in ways that I never expected and could never have imagined. Not only does DNA connect us across the world, literally, the topic of DNA connects us to one another as well.

Initially Bill’s search was to find his paternal family, specifically which Lakner male is his father. It’s a story to rival any soap opera, is still not solved and Bill would love to find the answer.

But never in our wildest dreams did we ever imagine that through this process, we would become family of choice. Sometimes it’s the human part of the connection that is the most important and not the genetics. Sometimes our family of choice is the best family of all!

Ethnicity and Physical Features are NOT Accurate Predictors of Parentage or Heritage

Let me say that again, ethnicity results are NOT an accurate predictor of heritage, or parentage. There is a great deal of confusion swirling around this topic. The fact that people are doubting parentage, or grandparentage, based on ethnicity results alone is alarming.

This week I received this inquiry:

  • I recently found my suspected birth father but he says he’s probably not because he has 2 generations of Amerindian in him and my tests came back negative until I did the analysis at GedMatch and found it to show Amerindian in small traces.

And this:

  • I recently took an ethnicity test and it showed less Scandinavian than it should. My father’s grandfather was from Sweden. Since my Scandinavian is less than 25%, is my father really my father, and is his father really his father? Now I’m really confused and frightened.

Last week, I received this inquiry:

  • My father and I both tested, but my ethnicity doesn’t all seem to be shared with him. Now I’m doubting whether he is really my father.

And this:

  • I received my ethnicity results, which showed no Native ancestry – but I know my ancestor was Native because she looks Indian in her photo.

And these are, by far, not the only inquiries in this vein. Some variation arrives almost every single day.

Be still my heart. Let me say this again: ethnicity results are NOT an accurate predictor of heritage or parentage.

Why?

First, let’s talk about why, and then I’d like to share what I consider to be a perfect example with you.

Why is ethnicity alone not an accurate predictor of parentage or heritage?

  • The field of population genetics, which is the underlying science beneath ethnicity predictions, is in its infancy. This means that if you were to test with the various vendors who offer these tests, your results would come back with different readings, sometimes significantly different readings. And this is just for one person – you – not the combination of two people. You can see my results from various vendors in the article, Which Ethnicity Test is Best?
  • Ethnicity results from all vendors can only be considered estimates based on the people they are comparing your results to (reference panels) and their internal software algorithms.
  • Some vendors have more experience than others.
  • I have seen ethnicity results that reflect an ethnicity for a child that is not included in either parent’s ethnicity results, when the parents are unquestionably the biological parents of the child. Clearly, this can’t be accurate. I suggest reading the article, Ethnicity Testing, a Conundrum, to understand more about how ethnicity estimates are generated.
  • You can easily have an ethnicity not found in one parent, if you inherited that portion of your DNA from the other parent.
  • You may not have inherited a portion of DNA from a parent in which a particular ethnicity is found. Your parent may have it, and you may not have inherited that piece of DNA. For examples of how and why this works, please read the article, Ancestral DNA Percentages – How Much of Them is in You?
  • Ethnicity estimates are only considered to be predominantly accurate at the continent level, specifically, Asia, Europe, Africa, Native American and Jewish. Yes, I know that Native American and Jewish are not continents, but their DNA is different enough from the rest that the presence of Jewish or Native DNA is presumed to be, generally, accurate, unless they are very small amounts which could also be noise.
  • Unless you’ve tracked your ancestors back several generations through genealogy, you won’t have an accurate expectation of the percentages of ethnicity. For an article describing how to do this, please read, Concepts – Calculating Ethnicity Percentages and Concepts – Percentage of Ancestors’ DNA.
  • You do inherit exactly 50% of each parent’s DNA, but you do NOT necessarily inherit 50% of each ancestor’s DNA that your parents carried. For example, if your parent carries 6.25% of a particular ancestor’s DNA, which is equivalent to that of a great-great-grandparent, you may or may not inherit half, or 3.125%, of that ancestor’s DNA. You will inherit someplace between none and 6.25%. Please read the article, Generational Inheritance, for more information about how DNA is inherited in successive generations.
  • You may not inherit a portion of a specific ancestor’s DNA that reflects a particular ethnic admixture, or at least not that the reference panels used by various companies can identify as associated with that ethnicity today. For more on how companies determine ethnicity, please read Determining Ethnicity Percentages.
  • In the case of minority admixture, meaning when you carry a small amount of admixture from one ethnicity – it may or may not be noise. If it’s genuine, it may or may not be found by ethnicity tests.
  • The absence of an ethnicity in your ethnicity results is not evidence that the specific ethnicity was not present in your ancestor, especially back in time several generations.
  • The lack of an ethnicity in your results does NOT equate to the fact that an ancestor of that ethnicity is not your ancestor. In other words, you can have a Native American ancestor, back several generations, and not show Native American ancestry in your ethnicity results. Absence of evidence is not always evidence of absence.
  • In the case of admixture involving both Native and African, and especially in the US, your Native or African ancestor(s) may have been admixed themselves, so you don’t really know what to expect in terms of percentages.
  • How you look, known as your phenotype, may or may not reflect perceived or real heritage at the level you expect.
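
The arithmetic behind expected percentages in the list above can be sketched in Python. This is only the average case, assuming a simple halving in each generation and no ancestor appearing more than once in the tree. As noted above, the amount you actually inherit from any one ancestor can be higher or lower, including none at all.

```python
def expected_share(generations_back):
    """Average fraction of DNA expected from one ancestor, halving each generation."""
    return 0.5 ** generations_back


# Generations back from the tester: 1 = parent, 2 = grandparent, and so on
for label, generations in [
    ("parent", 1),
    ("grandparent", 2),
    ("great-grandparent", 3),
    ("great-great-grandparent", 4),
    ("3rd great-grandparent", 5),
]:
    print(f"{label}: {expected_share(generations):.3%}")
```

An ancestor who contributed 6.25% to your parent is, on average, one more generation back for you, which is where the expected 3.125% comes from.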

Can Ethnicity EVER Predict Parentage?

Ok, given the above, is there an example of where an ethnicity test MIGHT cause us to wonder at parentage?

At one time, I would have said yes, if you “look white” but your presumed parent was considered to be black, or vice versa. I’m using black and white here as examples because in the US, we have a lot of admixture and “white” and “black” are different enough from each other that one would expect to be able to visually tell the difference, especially in relatively recent generations.

However, that’s not always true. Remember the story about the black twin and white twin from the same parents?  Here’s the Snopes confirmation, along with photos.

My Friend, Rosario

Rosario has been most gracious in allowing me to share his story in advance of a book he is currently penning. His journey is particularly poignant, considering the discussion above.

Rosario studied at Harvard and then became…are you ready…an opera singer. Rosario was raised as an Italian man, specifically Sicilian. Fitting, as in Luciano Pavarotti. Those good Italian operatic genes.

Except…Rosario discovered that he isn’t Italian.

What he is, however, is a genealogist.

Rosario’s mother was taken from her parents and raised in foster care. She had a brother who was shipped off elsewhere, to other states, bouncing from one terrible situation to another until his untimely death. Separated as a child, she had little contact with her brother until they were adults, and then only on two occasions. Her brother and her parents were hushed-up secrets.

Rosario’s mother told him that her heritage was Sicilian, and Rosario became, culturally, a Sicilian man.

Interested in the challenge of his mother’s past, and as genealogists are inclined to do, Rosario started digging in like a dog after a bone. He wanted to share his proud Sicilian heritage with his children.

What he found would amaze him, shock him and leave him reeling – all at the same time.

The Truth Surfaces

First, Rosario found inconsistencies.

For example, he found three different birth certificates for his mother. No one has three birth certificates, but his mother did. One without a father’s race, one with the father’s race redacted and then a third one with all information present. The father was identified as “black,” but because Rosario was raised as Sicilian, and Sicily is an area in Europe where people are darker and could be identified as black, Rosario assumed that was the explanation. That made sense and might also explain the confusion and the three different birth certificate versions.

Maybe.

Rosario’s first real clue came when his DNA results were returned showing the following ethnicity mixture:

  • 18% Sub Saharan African
  • 2% Malagasy
  • 2% Native American
  • 78% European

Rosario didn’t exactly know what to do with these startling results. They couldn’t be true, because his father was white, his father’s parents were white and his mother’s parents were Sicilian.

Years would pass before additional inroads would be made, hindered by the legal system, his mother’s failing health, young children of his own and the lack of relatives. Rosario had no one to ask.

Eventually, Rosario would discover that his grandparents, his mother’s parents, one white and one black, were prosecuted for engaging in sexual activity with each other – in Vermont.

In fact, they were not allowed to marry due to their different races, and their children, Rosario’s mother and her brother, were removed from their parents when the parents were sent to prison for the crime of having sex with someone not of their race.

Rosario’s grandfather was black. And yes, he was sent to prison, for having sex with a white woman – in the northeast – not in the deep south. Rosario’s white grandmother was sent to prison as well, which is when Rosario’s mother was placed in a foster home and her “darker brother” was sent away – far away – to another state where he was caught up in a horrific maze of institutional abuse.

The photo above is from one of only two times as an adult that Rosario’s mother saw her brother.

Given what had already happened to Rosario’s mother, yanked from her parents and brother and placed in a foster home by the age of 9, it’s easy to see why she fabricated the story of her family being Sicilian. Dark-skinned Sicilian was much safer than “half black” in a place and time when people were sent to prison and children ripped from their families. Her brother would eventually commit suicide as the result of the abuses he suffered as a child – and not at the hands of his parents but as a result of a horrible system in which he was systematically and repeatedly abused by adults who were supposedly “better” than his law-breaking parents.

For those of you who have never suffered the horrors of a family story in which your parent or grandparents were abused or mistreated, either by people they trusted or a system that was put in place to help them – good for you. But trust me, these revelations change the entire picture of who you think you are, your self-identity – and they will, guaranteed, rock your world to the point of physical nausea and literal nightmares.

The Photo

After adjusting for a bit, trying to absorb his new reality and attempting to come to grips with the abuses suffered by his grandfather, grandmother, mother and uncle, Rosario was beset by a new drive to get to know his until-then-missing grandparents.

Who were these people, as people? What were their lives like, before and after prison? Did they love each other? What did they look like? Were there any pictures?

Rosario looked high and low, and then finally, finally…through a hint planted in his mind in the middle of the night – Rosario woke up knowing the answer.

Earlier this year, Rosario was able to obtain his grandfather, Jerome Barber’s picture – a mugshot, the only photo he, or his mother, has ever seen of this man.

Jerome Barber’s Heritage

If Jerome Barber was entirely “black,” then his child, Rosario’s mother, would have been half black, or 50%, and Rosario would be 25% IF Rosario received exactly 25% of this grandfather’s DNA.

Looking at an expected DNA contribution of 25% African, given a black grandfather, compared to Rosario’s reported rate of 18% sub-Saharan African shows that expectation and reality can vary widely. In this case, there is a 7% difference with only one generation between Rosario and his “black” ancestor. It’s probable that Rosario’s 2% Malagasy and 2% Native also descend from this line based on testing of other family members including his mother and newly discovered relatives on his father’s side.

However, even with Rosario’s 18% sub-Saharan African and a black grandfather, until I told you, one would never look at Rosario and expect him to carry African heritage.

In photos of Rosario’s mother, you’d never guess that she is half black and half white, which is why she was “kept” and placed with a white foster family, while her brother, who was darker, was sent elsewhere. Unfortunately, Rosario’s uncle passed away before DNA testing was available.

So, in this case, Rosario’s phenotype, meaning how he looks, as compared to his genotype, his DNA contents, is deceiving and so is his mother’s.

Rosario’s mother has DNA tested, and her results show only 28% sub-Saharan African where 50% would have been expected with a 100% black father.

Rosario’s expected amount of sub-Saharan African DNA would be 14% or half of his mother’s 28%, if you are calculating from his mother, but if you are calculating from a fully African grandfather, Rosario’s amount of African DNA would be expected to be 25%. Clearly, Jerome Barber wasn’t entirely black.

Expected percentages of DNA if Rosario’s grandfather was 100% African are shown below for each generation.

Generation         Expected   Actual    Difference
Grandfather        100%       unknown   unknown
Mother             50%        28%       -22%
Rosario            25%        18%       -7%
Rosario’s child    12.5%      8%        -4.5%

As you can see in the above calculations, based only on Rosario’s grandfather being entirely African, there is a significant difference, especially in his mother’s generation.
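The arithmetic behind that chart can be sketched in a few lines of Python. This is only an illustration, assuming a 100% African grandfather and a simple halving in each generation; the function names are mine, and the "actual" figures are the reported percentages quoted above.

```python
# Reported sub-Saharan African percentages from the article, oldest generation first.
ACTUAL = {"Mother": 28.0, "Rosario": 18.0, "Rosario's child": 8.0}

def expected_share(start_pct, generations):
    """Expected percentage after halving once per generation (an average, not a guarantee)."""
    return start_pct / (2 ** generations)

def compare_to_expected(start_pct, actual_by_generation):
    """Return (name, expected, actual, difference) for each generation below the ancestor."""
    rows = []
    for generations, (name, actual) in enumerate(actual_by_generation.items(), start=1):
        expected = expected_share(start_pct, generations)
        rows.append((name, expected, actual, actual - expected))
    return rows

# Assumption: the grandfather is 100% African.
rows = compare_to_expected(100.0, ACTUAL)
for name, expected, actual, diff in rows:
    print(f"{name}: expected {expected:g}%, actual {actual:g}%, difference {diff:+g}%")
```

Halving is only the average expectation. Beyond the parent/child step, the amount actually inherited from any one ancestor varies, which is exactly what the differences show.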

Looking at these DNA amounts differently, the next chart shows the expected amount of DNA calculated on the percentage of DNA the parent actually carries. Again, we begin with Rosario’s grandfather at 100%.

Generation         Expected          Actual    Difference
Grandfather        100% (presumed)   unknown   unknown
Mother             50%               28%       -22%
Rosario            14%               18%       +4%
Rosario’s child    7%                8%        +1%

Working backwards, given the amount of African DNA that Rosario’s mother has, 28%, Rosario’s grandfather may have only been about 56% African himself.
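Working backwards is the same halving run in reverse. The sketch below is a rough estimate only. It assumes that Rosario’s white grandmother contributed no African DNA and that the mother’s measured 28% is the starting point for the generations below her; the function names are hypothetical.

```python
MOTHER_ACTUAL = 28.0  # mother's reported sub-Saharan African percentage

def implied_parent_share(child_pct):
    """Double the child's share; assumes the other parent contributed none of this ancestry."""
    return child_pct * 2

def expected_descendant_share(ancestor_pct, generations):
    """Halve the ancestor's share once per generation."""
    return ancestor_pct / (2 ** generations)

grandfather_estimate = implied_parent_share(MOTHER_ACTUAL)       # about 56% African
rosario_expected = expected_descendant_share(MOTHER_ACTUAL, 1)   # 14%
child_expected = expected_descendant_share(MOTHER_ACTUAL, 2)     # 7%

print(grandfather_estimate, rosario_expected, child_expected)
```

Because ethnicity estimates themselves vary by vendor and reference population, the 56% figure is best read as a ballpark, not a measurement.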

An awful irony.

Now that you know, you can look at Rosario and his grandfather’s photo together, and you can see the resemblance.

This same scenario works in reverse too. I cannot tell you how many times people have sent me photographs with the idea that their ancestor “looks Native” but the DNA shows none or a small amount of Native admixture. In those cases, the DNA may show less than expected or no Native admixture because the DNA has washed out in the subsequent generations, the testing panels aren’t picking it up, or the ancestor wasn’t Native to begin with. It’s extremely easy to see a resemblance, especially if it’s something you are looking “for” or expect to see.

Identifying Parentage

If ethnicity isn’t a good predictor and is highly variable, then how does one identify a parent?

As I mentioned previously, every child inherits half of each parent’s DNA. Therefore, if any child and parent both take an autosomal DNA test from a vendor that provides matching and centimorgan (cM) amounts, in addition to ethnicity, you will know for sure if those two people are parent and child.

In the graphic below, I’m showing my mother’s DNA test which shows me as a match at Family Tree DNA.

You can see that the relationship is identified as parent/child, which means, genetically, the software can’t tell which one of us is the parent and which one of us is the child, but only a parent and child will share this amount of DNA.

By the way, the only reason I have my mother’s autosomal results to utilize, above, is because Family Tree DNA archives the DNA of their customers for 25 years, which allowed me to run the autosomal Family Finder test on her DNA years after her death.

You can also see in the chromosome browser, above, that I match my mother on the full length of every chromosome. The gray areas are not measured by the testing companies. With the exception of identical twins, only a parent and child will match each other on the full length of all 22 chromosomes. Said another way, if you are a parent or child, the entire portion of every chromosome 1-22 will match and be fully colored, as above.

Identical twins will match the full length of every chromosome too, but instead of the child matching 50% of the parent’s DNA, identical twins match exactly – 100% – not 50% – so the software vendors can tell the difference.
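For readers who like to see the logic spelled out, here is a minimal Python sketch of the "full length of every chromosome" check. Everything in it is hypothetical: the chromosome lengths are made-up round numbers for three chromosomes instead of all 22, the segment lists are invented, and the 95% threshold is my own allowance for the untested gray areas. No vendor publishes this as their method.

```python
# Hypothetical chromosome lengths in cM, for illustration only.
CHROM_LENGTH_CM = {1: 280.0, 2: 260.0, 3: 220.0}

def covered_length(segments):
    """Total length covered by (start_cm, end_cm) segments, merging any overlaps."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(segments):
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total

def matches_full_length(matches_by_chrom, chrom_lengths, min_fraction=0.95):
    """True if matching segments cover at least min_fraction of every chromosome."""
    for chrom, length in chrom_lengths.items():
        covered = covered_length(matches_by_chrom.get(chrom, []))
        if covered / length < min_fraction:
            return False
    return True

# Invented examples: a parent/child style match and a distant-cousin style match.
parent_child = {1: [(0.0, 280.0)], 2: [(0.0, 130.0), (128.0, 260.0)], 3: [(0.0, 220.0)]}
cousin = {1: [(40.0, 75.0)], 3: [(100.0, 120.0)]}

print(matches_full_length(parent_child, CHROM_LENGTH_CM))
print(matches_full_length(cousin, CHROM_LENGTH_CM))
```

A check like this cannot separate a parent/child pair from identical twins. That takes the additional distinction described above, where twins match on both copies of each chromosome.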

You can view the expected amount of DNA sharing for various relationships on this chart from the article, Concepts – Relationship Predictions.

Therefore, if you want to know whether or not someone is a parent, both parties must take an autosomal test at a vendor who provides matching between participants along with the amount of matching DNA and relationship predictions. Ironically, the test that provides the matching is the exact same test that provides ethnicity results – so if you tested at one of these vendors, you don’t have to take another test. You just have to look at matching results, assuming both people tested. Even if both parties aren’t available to test, such as the parent, you can test a close relative of the purported parent, such as a sibling, and still obtain probable confirmation, because close relatives tend to match within prescribed ranges.

Please, don’t just look at ethnicity results and begin questioning, or presuming.

The vendors who provide autosomal tests along with chromosome browsers are Family Tree DNA, used in the examples above, and 23andMe.

Ancestry also reports parent/child relationships and total matching DNA in centiMorgans (cMs), minus some amount of DNA removed by their Timber process, but does not provide a chromosome browser. MyHeritage reports relationships and cM amounts, but their cM matching amounts are problematic today and they do not provide a chromosome browser. Still, one should be able to discern a parent/child relationship from either Ancestry or MyHeritage.

You can read about the various vendor offerings in the article, Which DNA Test is Best?

Genetic Genealogy Tests are Not Legally Binding

Lastly, none of the genetic genealogy tests are legally binding relative to paternity, even though they can and do clearly inform of parentage.

These tests aren’t binding because the testers’ DNA samples lack “chain of custody,” meaning the DNA sample was not given in an environment where the identities of both testers can be legally proven. It would be very easy to return a negative paternity result by having your neighbor or buddy swab or spit for you. In other words, if you are looking for legal proof, to be used in legal proceedings, you need to consult with an attorney, follow their advice and utilize the methodologies, laboratories and procedures in your state or country to achieve your legal goals.

However, if what you are looking for is simply an answer, do NOT, NOT, NOT rely on any ethnicity results or appearances as hints.  Instead look at chromosome matching between the potential child and parent or close relative in the absence of a parent.

Summary

Rosario’s comments relative to ethnicity results and testing are very profound, especially given his recent experiences:

In your published articles, you astutely state the extremely variable nature of the companies’ platforms and methodologies. This begs the question, “is admixture variable or are the companies’ platforms?”

I think that this is the more appropriate question to ask.

People are taking their admixture results literally and that is a dangerous game to play. Families break up over this potent issue. We should tread lightly until we can demonstrate a more scientific conclusion than what is currently being offered.

I agree with Rosario, and would hazard an answer to his question as well.

How much DNA we inherit from any ancestor other than our parents is variable. Which DNA we inherit from any ancestor is variable.

The vendors’ test results, the reference populations and their internal algorithms are all variable.

Therefore, everything about ethnicity testing is at least somewhat variable – and is exactly why ethnicity testing should NEVER be interpreted as an indicator of parentage.

Chromosome matching is not variable relative to a biological parent/child relationship. Children always inherit half of the autosomal DNA of each parent on chromosomes 1-22.

Correction note:  Jerome’s surname corrected to read Barber.  Jackson was Jerome’s mother’s surname.

______________________________________________________________________

Standard Disclosure

This standard disclosure will now appear at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 850 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA.

Which Ethnicity Test is Best?

While this question is very straightforward, the answer is not.

I have tested with or uploaded my DNA file to the following vendors to obtain ethnicity results:

The links above provide product reviews of recently released or updated results.

Guess what? None of the vendors’ results are the same. Some aren’t even close to each other, let alone to my known and proven genealogy.

In the article, Concepts – Calculating Ethnicity Percentages, I explained how to calculate your expected ethnicity percentages from your genealogy. As each vendor has introduced ethnicity results, or updated previous results, I’ve added to a cumulative chart.

It bears repeating before we look at that chart that ethnicity testing is relatively accurate on a continental level, meaning:

  • Africa
  • Europe
  • Asia
  • Native American
  • Jewish

Intra-continent or sub-continent, meaning within continents, it’s extremely difficult to tease out differences between countries, like France, Germany and Switzerland. Looking at the size of these regions, and the movement of populations, we can certainly understand why. In many ways, it’s like trying to discern the difference between Indiana and Illinois.

What Does “Best” Mean?

While the question of which test is best seems like it would be easy to answer, it isn’t.

“Best” is a subjective term, and often, people interpret best to mean that the test reflects a portion of what they think they know about their ethnicity. Without a rather robust and proven tree, some testers have little objective data on which to base their perceptions. In fact, many people, encouraged by advertising, take these tests with the hope that the test will in fact provide them with the answer to the question, “Who am I?” or to confirm a specific ancestor or ancestral heritage rumor.

For example, people often test to find their Native American ancestry and are disappointed when the results don’t reveal Native ancestry. This can be because:

  • There is no Native ancestor.
  • The Native ancestor thought to be 100% was already highly admixed.
  • The Native ancestor is too far back in the tester’s tree and the ancestor’s DNA “washed out” in subsequent generations.
  • The testing company failed to pick up what might be arguably a trace amount.

Genealogy Compared to All Vendors’ Results

In some cases, discrepancies arise due to how the different companies group their results and what the groupings mean, as you can see in the table below comparing all vendors’ results to my known genealogy.

In the table below, I’ve highlighted in yellow the “best” company result by region, as compared to my known genealogy shown in the column titled “Genealogy %”.

British Isles – The British Isles is fairly easy to define, because they are islands, and the results for each vendor, other than The Genographic Project, are easy to group into that category as well. Family Tree DNA comes the closest to my known genealogy in this category, so would be the “best” in this category. However, not every region, shown in pink, has the same “best” vendor.

Scandinavian – I have no actual Scandinavian heritage in my genealogy, but I’m betting I have a number of Vikings, or that my German/Dutch is closely related to the Scandinavians. So while LivingDNA is the lowest, meaning the closest to my zero, it’s very difficult to discern the “true” amount of Scandinavian heritage admixed into the other populations. It’s also possible that Scandinavian is not reflecting (entirely) the Vikings, but Dutch and German as a result of migrations of entire peoples. My German and Dutch ancestry cumulatively adds to 39%.

Eastern European – I don’t have any known Eastern European, but some of my German might fall into that category, historically. I simply don’t know, so I’m not ranking that group.

Northwestern Europe – For the balance of Northwestern Europe, 23andMe comes the closest with 43% of my 45.24% from my known genealogy.

Mediterranean and Southern European – For the Mediterranean, Greece, Italy and Southern Europe, I have no known genealogy there, and not even anyplace close, so I’m counting as accurate all three vendors who reported zero, being LivingDNA, Family Tree DNA and MyHeritage.

Unknown – The next grouping is my unknown percentage. It’s very difficult to ascribe a right or wrong to this grouping, so I’ve put vendor results here that might fall into that unknown group. In my case, I suspect that some of the unknown is actually Native on my father’s side. I haven’t assigned accuracy in this section. It’s more of a catch all, for now.

Native and Asian – The next section is Native and Asian, which in some circumstances can be attributed to Native ancestry. In this case, I know of about 1% proven Native heritage, as the Native on my mother’s line is proven utilizing both Y and mitochondrial DNA tests on descendants. I suspect there is more Native to be revealed, both on her side and because I can’t positively attribute some of my father’s lineage that is mixed race and reported to be Native, but is as yet unproven. By proof, I mean either Y DNA, mitochondrial DNA or concrete documentation.

I have counted any vendor who found a region above zero and smaller than my unknown percentage of 3.9% as accurate, those vendors being Family Tree DNA, Ancestry, 23andMe and MyHeritage.

Southwest Asia – I have no heritage from Southwest Asia, which typically means the Indian subcontinent. National Geographic reports this region, but their categories are much broader than the other companies, as reflected by the grey bands utilized to attempt to summarize the other vendors’ data in a way that can be compared to the Genographic Project information. While I’m pleased to contribute to the National Geographic Society through the Genographic Project, the results are the least connected to my known genealogy, although their results may represent deeper migratory ancestry.
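The region-by-region comparison above boils down to asking which vendor’s number sits closest to the genealogy percentage. A small Python sketch of that idea follows. The regions, vendor names and percentages are placeholders I made up for illustration, not my actual results or any real vendor’s figures, and the function name is hypothetical.

```python
def closest_vendor(genealogy_pct, vendor_pcts):
    """Return (vendor, reported_pct, gap) for the vendor nearest the genealogy figure."""
    vendor = min(vendor_pcts, key=lambda name: abs(vendor_pcts[name] - genealogy_pct))
    reported = vendor_pcts[vendor]
    return vendor, reported, abs(reported - genealogy_pct)

# Placeholder figures only: not real genealogy or vendor results.
genealogy = {"Region X": 40.0, "Region Y": 0.0}
reported = {
    "Region X": {"Vendor A": 31.0, "Vendor B": 43.0, "Vendor C": 55.0},
    "Region Y": {"Vendor A": 0.0, "Vendor B": 12.0, "Vendor C": 4.0},
}

best = {region: closest_vendor(genealogy[region], reported[region]) for region in genealogy}
for region, (vendor, pct, gap) in best.items():
    print(f"{region}: {vendor} reported {pct:g}%, {gap:g} points from genealogy")
```

When several vendors tie, as with the three that reported zero for the Mediterranean, this sketch returns only the first one, whereas I counted all of them as accurate. It also cannot rank a category like my unknown percentage, where there is no firm genealogy figure to compare against.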

Summary

As you can see, the best vendor is almost impossible to pinpoint and every person that tests at multiple vendors will likely have a different opinion of what is “best” and the reasons why. In some ways, best depends on what you are looking for and how much genealogy work you’ve already invested to be able to reliably evaluate the different vendor results. In my case, the best vendor, judged by the highest total percentage of “most accurate” categories would be Family Tree DNA.

While DNA testing for ethnicity really doesn’t provide the level of specificity that people hope to gain, testers can generally get a good view of their ancestry at the continental level. Vendors also provide updates as the reference groups and technology improve. This is a learning experience for all involved!

I hope that seeing the differences between the various vendors will encourage people to test at multiple vendors, or transfer their results to additional vendors to gain “a second set of eyes” about their ethnicity. Several transfers are free. You can read about which vendors accept results from other vendors, in the article, Autosomal DNA Transfers – Which Companies Accept Which Tests?

I also hope that ethnicity results encourage people to pursue their genealogy to find their ancestors. Ethnicity results are fun, but they aren’t gospel, and shouldn’t be interpreted as “the answer.” Just enjoy your results and allow them to pique your curiosity to discover who your ancestors really were through genealogy research! There are bound to be some fun surprises just waiting to be discovered.

If you are interested in why your results may vary from what you expected, please read “Ethnicity Testing – A Conundrum.”

If you’re interested in taking a DNA test, you might want to read “Which DNA Test is Best?” which discusses and compares what you need to know about each vendor and the different tests available in the genetic genealogy market today.
