Concepts: Anonymized Versus Pseudonymized Data and Your Genetic Privacy

Until recently, when people (often relatives) expressed concerns about DNA testing, genetic genealogy buffs would explain that the tester could remain anonymous, and that their test could be registered under another name; ours, for example.

This means, of course, that since our relative is testing for OUR genealogy addiction, er…hobby, that we would take care of those pesky inquiries and everything else. Not only would they not be bothered, but their identity would never be known to anyone other than us.

Let’s dissect that statement, because in some cases, it’s still partially true – but in other cases, anonymity in DNA testing is no longer possible.

You certainly CAN put your name on someone else’s kit and manage their account for them. There are a variety of ways to accomplish this, depending on the testing vendor you select.

If the DNA testing is either Y or mitochondrial DNA, it’s extremely UNLIKELY, if not impossible, that their Y or mitochondrial DNA is going to uniquely identify them as an individual.

Y and mitochondrial DNA are extremely useful in identifying someone as having descended from an ancestor, or not, but they (probably) won’t reveal the tester’s identity to any matching person – at least not without additional information.

If you need a brush-up on the different kinds of DNA and how they can be used for genealogy, please read 4 Kinds of DNA for Genetic Genealogy.

Y and mitochondrial DNA can be used to rule in or rule out specific descendant relationships. In other words, you can tell for certain that you are NOT related through a specific line. Conversely, you can sometimes confirm that you are most likely related to someone you match through the direct Y (patrilineal) line for males, and matrilineal mitochondrial line for both males and females. That match could be very distant in time, meaning many generations – even hundreds or thousands of years ago.

However, autosomal DNA, which tests a subset of all of your DNA for the genealogical goal of matching to cousins and confirming ancestors, is another matter entirely. Some of the information you discern from autosomal testing includes how closely you match, which effectively predicts a range of relationships to your match.

These matches are much more recent in time and do not reach back into the distant past. The more closely you are related, the more DNA you share, which means that your DNA is identifying your location in the family tree, regardless of the name you put on the test itself.
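As a rough illustration of why the amount of shared DNA points to a place in the family tree, here is a minimal sketch that computes the theoretical average fraction of autosomal DNA shared by full cousins. The function name is hypothetical, and the figures are expected averages only; real matches vary around them, and each vendor uses its own relationship ranges.

```python
def expected_cousin_fraction(degree: int) -> float:
    """Average fraction of autosomal DNA shared by full n-th cousins.

    The path between the cousins through one shared ancestor crosses
    2 * (degree + 1) generations, halving the DNA each time. There are
    two shared ancestors (a couple), so the result is doubled.
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 2 * 0.5 ** (2 * (degree + 1))


for n in (1, 2, 3):
    print(f"cousin degree {n}: {expected_cousin_fraction(n):.4%}")
```

First cousins come out at 12.5% on average, second cousins at about 3.1% and third cousins at under 1%, which is why a close match narrows the possible relationships so sharply.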

Now, let’s look at the difference between anonymization and pseudonymization.

It may seem trivial, but it isn’t.

Anonymization vs Pseudonymization

Recently, as a result of the European Union GDPR (General Data Protection Regulation), we’ve heard a lot about privacy and pseudonymization, which is not the same as anonymization.

Anonymized data must be entirely stripped of any identifiable information, making it impossible to derive insights on a discrete individual, even by the person or entity who performed the anonymization. In other words, anonymization cannot be reversed under any circumstances.

Given that the purpose of genetic genealogy conflicts with the concept of anonymization, the term pseudonymization is more properly applied to the situation where someone masks or replaces the name of the tester with the goal of hiding the identity of the person who is actually taking the test.

Pseudonymization under GDPR (Article 4(5)) is defined as “the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of ‘additional information.’”

In reality, pseudonymization is what has been occurring all along, because the tester could always be re-identified by you.
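To make the distinction concrete, here is a small sketch of pseudonymization as described above: the tester's name is replaced with an alias, and the mapping (the "additional information") is kept separately by the person managing the account. The function names and the alias format are made up for illustration.

```python
import secrets
from typing import Optional


def pseudonymize(tester_name: str, key_table: dict) -> str:
    """Return an alias for the tester and record the mapping privately."""
    alias = "kit-" + secrets.token_hex(4)
    key_table[alias] = tester_name  # the "additional information"
    return alias


def reidentify(alias: str, key_table: dict) -> Optional[str]:
    """Only someone holding the key table can reverse the alias."""
    return key_table.get(alias)


private_keys = {}
alias = pseudonymize("Uncle Henry", private_keys)
print(alias, "->", reidentify(alias, private_keys))
print(reidentify(alias, {}))  # without the key table, the alias cannot be reversed
```

This only hides the name. It does nothing to disguise the DNA itself, which is the point of the sections that follow.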

However, and this is important, neither anonymization nor pseudonymization can be guaranteed to disguise your identity anymore.

Anonymous Isn’t Anonymous Anymore

The situation with autosomal DNA and the expectation of anonymity has changed rather gradually over the past few years, but with tidal wave force recently with the coming-of-age of two related techniques:

  • The increasingly routine identification of biological parents
  • The Buckskin Girl and Golden State Killer cases in which a victim and suspect were identified in April 2018, respectively, by the same methodology used to identify biological parents

Therefore, with autosomal DNA results, meaning the raw data results file ONLY, neither total anonymity nor any expectation of pseudonymization is reasonable or possible.

Why?

The reason is very simple.

The size of the databases of the combined mainstream vendors has reached the point where it’s unusual, at least for US testers, to not have a reasonably close match with a relative that you did not personally test – meaning third cousin or closer. Using a variety of tools, including in-common-with matches and trees, it’s possible to discern or narrow down candidates to be either a biological parent, a crime victim or a suspect.

In essence, the only real difference between genetic genealogy searching, parent searches and victim/suspect searches is motivation. The underlying technique is exactly the same with only a few details that differ based on the goal.

You can read about the process used to identify the Golden State Killer here. Just a few days later, a second case, the Cook/Van Cuylenborg double homicide cold case in Snohomish County, Washington, was solved utilizing the following family tree of the suspect, whose DNA matched the blue and pink cousins.

Provided by the Snohomish County Sheriff

A genealogist discovering those same matches, of course, would be focused on the common ancestors, not contemporary people or generations.

To identify present day individuals, meaning parents, victims or suspects, the researcher identifies the common ancestor and works their way forward in time. The genealogist, on the other hand, is focused on working backwards in time.

All three types of processes (genealogical, parent identification and law enforcement) depend on identifying cousins that lead us to common ancestors.

At that point, the only question is whether we continue working backwards (genealogically) or begin working forwards in time from the common ancestors for either parent identification or law enforcement.
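Working forwards from a common ancestor amounts to walking down a family tree. The sketch below shows the idea with a made-up family; the data structure and function name are assumptions for illustration, not a description of any specific tool.

```python
def descendants(children_of, ancestor):
    """Return everyone descended from `ancestor`, working forward in time."""
    found = []
    queue = list(children_of.get(ancestor, []))
    while queue:
        person = queue.pop(0)
        found.append(person)
        queue.extend(children_of.get(person, []))
    return found


# Made-up family: keys are parents, values are lists of their children.
children_of = {
    "Common ancestor": ["Child A", "Child B"],
    "Child A": ["Grandchild A1"],
    "Child B": ["Grandchild B1", "Grandchild B2"],
}

everyone = descendants(children_of, "Common ancestor")
# People with no recorded children form the most recent generation in this tree.
candidates = [p for p in everyone if p not in children_of]
print(candidates)
```

In a real search, the candidate list is then narrowed using other matches, ages, locations and similar non-genetic information.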

Even when the suspect’s or victim’s name or identifying information is not known, their DNA alone, in combination with the DNA of their matches, can identify them uniquely (unless they are an identical twin), or closely enough that targeted testing or non-genetic information will confirm the identification.

Sometimes, people newly testing discover that a parent, sibling or half sibling genetic match is just waiting for them and absolutely no analysis is necessary. You can read about the discovery of the identity of my brother’s biological family here and here.

Therefore, we cannot represent to Uncle Henry, especially when discussing autosomal DNA testing, that he can test and remain anonymous. He can’t. If there is a family secret, known or unknown to Uncle Henry, it’s likely to be exposed utilizing autosomal DNA and may be exposed utilizing either Y or mitochondrial DNA testing.

For the genealogist, this may cause Pavlovian drooling, but Uncle Henry may not be nearly so enthralled.

In Summary

Genealogical methods developed to identify currently living individuals have made the concept of genetic anonymity obsolete. You can see in the pedigree chart example below how the same match, in yellow, can lead to solving any of the three different scenarios we’ve discussed.

If the tester is Uncle Henry, you might discover that his parents weren’t his parents. You also might discover who his real parents were, when your intention was only to confirm your common great-grandparents. So much for that idea.

A match between Henry and a second cousin, in our example above, can also identify someone involved in a law enforcement situation – although today those are very few and far between. Testing for law enforcement purposes is prohibited according to the terms and conditions of all 4 major testing vendors: Ancestry, 23andMe, Family Tree DNA and MyHeritage.

Currently law enforcement kits to identify either victims or suspects can be uploaded at GedMatch but only for violent crimes identified as either homicide or sexual assault, per their terms and conditions.

Furthermore, both 23andMe and Ancestry, who previously reserved the right to anonymize your genetic information and sell or otherwise utilize that information in aggregated format, can no longer do so under the new GDPR legislation without your specific consent. GDPR, while a huge pain in the behind for other reasons, has returned the control of the consumer’s DNA to the consumer in these cases.

The loss of anonymity is the inevitable result of this industry maturing. That’s good news for genetic genealogy. It means we now have lots of matches – sometimes more than we can keep up with!

Because of those matches, we know that if we test our DNA, or that of a family member, our DNA plus the common DNA shared with many of our relatives is enough to identify us, or them. That’s not news to genealogists, but it might be to Uncle Henry, so don’t tell him that he can be anonymous anymore.

You can pseudonymize accounts to some extent by masking Uncle Henry’s name or using your name. Managing accounts for the same reasons of convenience that you always did is just fine! We just need to explain the current privacy situation to Uncle Henry when asking permission to test or to upload his raw data file to GedMatch (or anyplace else), because ultimately, Uncle Henry’s DNA leads to Uncle Henry, no matter whose name is on the account.

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 900 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA, or one of the affiliate links below:

Affiliate links are limited to:

DNAPainter – Mining Vendor Matches to Paint Your Chromosomes

This isn’t quite the same as when my mother used to talk about painting the town, but in genetic genealogy terms, it’s better.

This is the second of 4 articles that will describe how to use DNA Painter.

Today, I’d like to talk about how I utilize the various vendor testing tools combined with DNAPainter to “mine my DNA,” or better put, to mine my ancestors’ DNA, which is now mine, pun intended.

To review instructions for how to set up and use the DNA Painter tool, please read DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts and then come back here to proceed.

I’m going to discuss each vendor’s tools and how I’ve used them, sometimes in combination.

57% Painted

Is this not a beautiful thing to behold? That’s my ancestors, in loving color, looking back at me, on MY chromosomes.

I’m completely thrilled that I have managed to paint 57% of my chromosomes. I’m a visual person, and while I’ve worked with spreadsheets now for years, I’ve officially abandoned them. Ok, mostly.

Yes, you heard me right – I’ve abandoned the spreadsheets in favor of DNA Painter, at least for segments where I can positively identify an ancestral couple. In other words, those segments that can be reliably mapped.

That 57% is made up of 445 segments in total, split between my maternal and paternal sides. That’s without counting my mother’s DNA. While I do utilize matching to my mother in order to be sure that a match is really a valid match, I didn’t paint her DNA. Obviously, I’m going to match her 100%, and DNA Painter already breaks chromosomes into my pink maternal and blue paternal sides.

Key Elements

  1. The single best thing you can do in order to paint your chromosomes is to have known family members and cousins test. You can then paint their DNA that matches yours, attributing it to their identified family line.
  2. The second best thing you can do is to work with your matches using their trees to identify your common ancestor.

Now, you’re ready to begin painting.

I’m going to step through the process I used at each vendor to identify paintable segments.

I did not paint segments that I could not assign to an ancestral line. The one exception is my endogamous Acadian line: segments that I can identify as Acadian, but cannot attribute to a specific ancestor or ancestors, I labeled simply as Acadian. When I can identify the Acadian ancestor, I paint that segment using the ancestors’ names.

Family Tree DNA

At Family Tree DNA, I begin with my closest matches that are not immediate family – meaning not my parents, children or grandchildren. I’m looking for aunts, uncles, cousins, etc. I don’t paint siblings, but often half siblings are extremely useful because they can help you identify which paternal side other matches are related to.

In the first DNA Painter article, I explained how to utilize the Family Tree DNA chromosome browser to select an individual whose matching DNA can be displayed so that you can copy and paste that segment into the painting feature of DNA Painter.

On your results page, your “bucketed individuals” who have been assigned as maternal (pink icon above) or paternal (blue icon not shown) can be a huge clue when used in conjunction with the in-common-with (ICW) tool and the matrix.

You can also search by ancestral surname and then evaluate each match through common surnames, trees and other resources. If you’re not familiar with how to use the tools at Family Tree DNA, here’s a quick run-through.

Select the individual whose DNA you wish to paint, view in the chromosome browser, then copy and paste from the grid below to the DNAPainter tool.

I painted the matching DNA of all the people whose common ancestor with me I could positively identify before moving on to the next vendor.

Who Have I Painted?

As you begin to paint segments from multiple vendors, you may wonder if you’re finding duplicates. It’s easy to tell. At DNA Painter, click on “All segment data,” below the legend in the bottom right corner.

This displays the entire list of matches whose DNA you have painted, in spreadsheet format. You can sort by match name or simply do a browser search. (CTRL+F)

You can also download this data into a CSV (Excel compatible) file at the top left of this page.

Avoiding Duplicates

As you view and paint your matches at the various vendors, you may discover that you have already found a match with that person at another vendor, either because they tested there or uploaded their autosomal file. When possible, avoid duplicate painting. It won’t help anything and will just clutter your chromosomes. You may not always be able to identify a match as a duplicate, especially if the tester utilizes a pseudonym at various locations. Don’t worry though, because you can always easily delete it later and a duplicate person/segment certainly won’t hurt anything.

Ok, now to our next vendor! Let’s find more segments to paint.

MyHeritage

At MyHeritage, click on DNA matches.

At the right of the search box, fly over the little pink key (or funnel) looking thing and you’ll see the option for “Has Smart Matches.” That’s what you’re looking for.

Click on the key icon.

Smart Matches mean that your DNA matches and you have a common ancestor in your trees. Click on the purple button to review this DNA match.

For each match, scroll all the way down to the bottom where your matching chromosome segments will be colored.

At the right, above the chromosome browser, click on “advanced options” which will allow you to select “download shared DNA info.” You need to download to your system so that you can copy and paste the matching segment information to DNA Painter.

MyHeritage has a few more columns than necessary, and DNA Painter can’t utilize them. Delete the columns for Name, Match Name, RSID beginning and end, and also eliminate SNPs due to an overestimation issue. In many cases, the SNPs at MyHeritage are twice or more than the number of SNPs when comparing the same segment at other vendors.

Now that your segment is cleaned up, copy the entire group shown above, minus the yellow columns which you’ve deleted, and paste into the DNA Painter spreadsheet.
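If you would rather not delete the columns by hand, the cleanup can be scripted. This is only a sketch: the column header names below are assumptions based on the description above, so check them against the header row of your own downloaded file before relying on it. The sample row is made up.

```python
import csv
import io

# Assumed header names for the columns to discard; adjust to match your file.
DROP_COLUMNS = {"Name", "Match name", "Start RSID", "End RSID", "SNPs"}


def strip_columns(csv_text: str, drop=DROP_COLUMNS) -> str:
    """Return CSV text with the unwanted columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    keep = [col for col in reader.fieldnames if col not in drop]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not in `keep` are ignored
    return out.getvalue()


# Made-up sample data, for illustration only.
sample = (
    "Name,Match name,Chromosome,Start Location,End Location,"
    "Start RSID,End RSID,Centimorgans,SNPs\n"
    "Me,Cousin,5,1000000,9000000,rs1,rs2,12.5,8000\n"
)
print(strip_columns(sample))
```

What remains is the chromosome, start, end and centimorgan information, ready to paste.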

MyHeritage has recently added a triangulation feature, shown at the far right, below, indicating that these two people individually triangulate with me and Alberta. The icon at far right of “5th cousin” indicates triangulation.

By clicking on the triangulation icon, you then see how that person triangulates with both your match and you – in this case, me, Alberta, and Chandler.

You may choose to paint triangulated segments, BUT the size of the triangulated segment is often going to be smaller than the amount of DNA that you match individually to either one or both people.

In the example above, you can see that you match the pink person on a significantly longer segment than you match the tan person. The amount of DNA where you match both the pink and tan person is smaller yet, because the area where you match the tan person extends beyond where you match the pink person and vice versa. If you were going to paint ONLY the triangulated segments, you would paint only the portion that is both pink and tan, “boxed” above.
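The "boxed" portion described above is simply the overlap of the two matching segments. Here is a minimal sketch with made-up positions. It only computes the positional overlap; it does not check that the two matches also match each other on that stretch, which is what the vendor's triangulation tool is for.

```python
from typing import Optional, Tuple

Segment = Tuple[int, int]  # (start position, end position) on one chromosome


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region covered by both segments, or None if they don't overlap."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


pink = (10_000_000, 60_000_000)  # made-up positions
tan = (40_000_000, 75_000_000)
print(overlap(pink, tan))  # (40000000, 60000000)
```

The overlap is shorter than either individual match, which is why painting only that portion would throw away usable segment information.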

I don’t recommend painting ONLY triangulated segments, because you’ll be depriving yourself of the ability for each person to match others on the portions of the segments on which they match you, but not the other person in question.

In this example, utilizing DNA Painter, you’ll see that people in fact match you AND the pink person on several segments. The segment shown in pink, at MyHeritage, above, is shown on chromosome 5 in DNA Painter as the long mustard colored segment. Look at how many people match you on that segment. This is why we don’t paint only the triangulated portions of the chromosome. That long mustard segment match will triangulate with many people on smaller portions of that mustard segment, as evidenced by the yellow, grey, blue, cinnamon, purple and red segment matches.

DNA Painter helps you triangulate, so there is no reason to restrict your painting to triangulated segments.

Triangulation is a great tool, but don’t mix triangulated segments with matching segments in the same profile, at least not until you get the hang of the tool and of using multiple vendors’ results.

23andMe

Unfortunately, 23andMe doesn’t have tools like tree matching (MyHeritage) or maternal/paternal phasing (Family Tree DNA), but they do allow testers to enter common surnames.

Looking at closer matches, meaning first, second or third cousins, if they list even a few surnames, you may well be able to identify the common genealogical line, especially in conjunction with ancestral locations and the other people you match in common.

Sometimes you can glean enough information to identify your common ancestor. In this case, even if I didn’t know Cheryl, the surname would have identified the ancestor. If that didn’t do it, the “in common” list below would!

Once you’ve identified the common ancestor and decide you’re ready to paint, click on the Tools tab at the top of your page and select DNA Relatives.

On the DNA Relatives tab, click on the relative whose DNA you wish to paint. I’m selecting my cousin, Cheryl.

Click on the blue DNA Comparison, in the upper right hand corner.

On the comparison screen, you will select yourself as one person and Cheryl as the other.

At the top you’ll see the two individuals and their overlapping segments painted onto chromosomes. Scroll down and you’ll see the segment detail, below.

Highlight the rows (they’ll turn blue, like above) and right click to copy the segment information.

The next step is to drop the results into a spreadsheet, just long enough to delete the first and last columns, shown in red below, then copy the remaining rows and paste into the DNA Painter tool.

Mining Ancestry Data at GedMatch

GedMatch is somewhat of a special case, because GedMatch doesn’t do DNA testing, but provides an open sharing platform by facilitating uploads of raw autosomal files from multiple other vendors. Therefore, anyone with results at GedMatch tested elsewhere. If you tested at all of the other vendors, you will probably find matches at GedMatch who also match you at those vendors.

Because 23andMe does not support the uploading of Gedcom files, if your match has uploaded a Gedcom file to GedMatch, or connected to Geni or WikiTree, then you may be able to identify your common ancestor at GedMatch that you were not able to identify at 23andMe.

Conversely, if you match at Ancestry, you won’t be able to paint from Ancestry, because Ancestry does not provide segment information. We will talk about Ancestry as a special case next, but for now, let’s focus on how to utilize GedMatch.

At GedMatch, you’ll work in steps after setting your account up and uploading your raw data file from either:

If you tested elsewhere, or after August of 2017 at 23andMe, you will have to upload to a special section called GedMatch Genesis. GedMatch Genesis provides a sandbox area for files other than the ones listed above that are generally incompatible with those files and with each other. Genesis files often have few SNP locations in common and not enough to match reliably.

I do not recommend DNA painting utilizing segments from GedMatch Genesis.

GedMatch is currently merging their regular GedMatch service with the Genesis service, so I’m not entirely clear how you will tell the difference between the kits known to match reliably, mentioned above, and others after the merge.

Currently, kits with T prefix (Family Tree DNA), A (Ancestry) and M (23andMe) show version levels in the type field when you match in regular GedMatch. MyHeritage kits are processed by the Family Tree DNA lab. G kits used a generic upload, so you can’t tell where they originated.

Kits uploaded in the Genesis sandbox seem to be assigned double alpha letter kit prefixes at random. Genesis includes a “Testing Company” field which does not include a version number. Today, just stay with the regular GedMatch one-to-many and one-to-one matching for DNA Painter.

First, you’ll want to perform a one-to-many match.

This page shows your closest 2000 results, which in my case truncates my matches at 12.7 cM. This means if I want to see my results below 12.7 cM, I must subscribe to the Tier 1 Utilities in order to be able to display over 2000 matches.

We’ll discuss how to utilize Tier 1 matching in the Ancestry portion, next, but for now, we’ll just be working with the regular one-to-many matches report.

Of course, trusty cousin Cheryl has results here as well.

In order to compare Cheryl’s results to my own, I need to do two separate things:

  • Click on the A link under the Autosomal Details column (above) and/or
  • Click on the X link under the X DNA column

These two results, both of which are paintable, do not display together so must be selected separately.

By clicking on the A or X, GedMatch will display a one-to-one comparison. I leave this page (below) at the default values and simply click submit.

Your next screen will be a match grid.

Once again, select and copy the results, then paste into DNA Painter. If you also have an X match with this individual, return to the one-to-many match page and then click on the X link to repeat the same process for the X chromosome.

Ancestry Through GedMatch

As far as I’m concerned, the best thing about Ancestry matches is DNA shared ancestor hints (SAH) – meaning those green leaves visible near the green “view match” button which indicate that you share both DNA and a common ancestor(s) in your trees.

Followed immediately by the worst thing which is that Ancestry provides no segment data. However, pairing Ancestry with GedMatch can provide you with some segment information, although you do have to dig. That digging was certainly worthwhile for me, as I found several readily identifiable matches.

When I find a green leaf shared ancestor hint at Ancestry, I record as much information about that match as I can in a spreadsheet. There are three reasons.

  • Ancestry hints tend to come and go, rather inexplicably, and I want to have that information someplace besides at Ancestry.
  • I want to be able to view how many matches I have through specific ancestors, which I can do in a spreadsheet by sorting.
  • I want to be able to mine GedMatch for segment information for people at Ancestry who have uploaded to GedMatch.
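Sorting a spreadsheet works fine for counting matches per ancestor, but the same tally can be sketched in a few lines once the spreadsheet is exported. The rows and field names below are made up and simply stand in for whatever columns you keep in your own match spreadsheet.

```python
from collections import Counter

# Made-up rows standing in for a personal Ancestry match spreadsheet.
matches = [
    {"match": "Match 1", "common_ancestor": "Ancestor A"},
    {"match": "Match 2", "common_ancestor": "Ancestor B"},
    {"match": "Match 3", "common_ancestor": "Ancestor A"},
    {"match": "Match 4", "common_ancestor": "Ancestor A"},
]

# Count how many matches trace back to each ancestor.
counts = Counter(row["common_ancestor"] for row in matches)
for ancestor, total in counts.most_common():
    print(ancestor, total)
```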

Note the RJE V2 results, a 6th cousin who I match at 6.6 cM, as we’ll be using that at GedMatch.

I maintain several columns in my Ancestry Match spreadsheet, as shown above. I track people who might be good Y or mitochondrial DNA candidates, as well as GedMatch numbers or other useful information.

I don’t utilize segments smaller than 7 cM for DNA Painter, BUT, Ancestry almost always under-reports the matching segment size due to their internal process which removes some segments that do match. Therefore, I search for all Ancestry matches in GedMatch and paint them if they are 7cM or over at GedMatch. You will match at Ancestry down to 6 cM. Since 7cM is the default GedMatch threshold, that works out well. I don’t find them if they are under 7cM at GedMatch, and I don’t care.
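If you keep your GedMatch segments in a file, the 7 cM rule described above is easy to apply automatically. This is a minimal sketch with made-up segments; the field names are assumptions, not a GedMatch or DNA Painter format.

```python
MIN_CM = 7.0  # the painting threshold described above


def paintable(segments, min_cm=MIN_CM):
    """Keep only segments at or above the threshold."""
    return [seg for seg in segments if seg["cm"] >= min_cm]


# Made-up segments for illustration.
segments = [
    {"chr": "5", "start": 1_000_000, "end": 9_000_000, "cm": 12.5},
    {"chr": "9", "start": 3_000_000, "end": 6_000_000, "cm": 6.6},
    {"chr": "12", "start": 2_000_000, "end": 8_000_000, "cm": 7.0},
]
print([seg["chr"] for seg in paintable(segments)])
```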

In my case, the free one-to-many GedMatch tool reaches its 2000 match threshold at 12.7 cM, so to obtain segments smaller than that I need to utilize the Tier 1 subscription utilities, which are well worth every dollar.

The one-to-many match looks quite different for the Tier 1 tool.

You’ll need to play with this a bit to determine how high you need to set the limit to see all of your 7cM matches. In my case, I had to set it to 20,000.

I utilize two monitors, so I display my Ancestry spreadsheet on the first monitor and the GedMatch one-to-many match table on the second monitor.

Then, utilizing the browser’s search function, I search for any identifiable portion of the information for the Ancestry match at GedMatch.

In the first example, the user’s name is RJE V2. I search at GedMatch for “RJE” using “ctrl+F” which is the browser’s find function.

You can see that the search found a total of 3 different “RJE” entries. Looking at the first 2, you can see that one is labeled V4 and one is labeled V2. Typically, I would look at this and decide that the RJE V2 is the right match based on the user name at Ancestry.

However, look closer.

The RJE V2 at GedMatch has a much higher amount of shared DNA at 3587.1 cM total than the RJE V2 at Ancestry with a total of 6.6 cM. Clearly, this is not the same person, even though the user name is the same.

For all we know, a different person may have used the same user name, which is clearly an alias, noted by the “*”. Or the same person may have multiple kits at GedMatch.

However, in this case, the RJE V2 is not the same match.
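The sanity check above, comparing the total shared DNA reported in both places before trusting a user name, can be sketched as a simple ratio test. The cutoff ratio here is an arbitrary assumption for illustration, not a rule from Ancestry or GedMatch; some difference is expected because the two sites calculate totals differently.

```python
def plausibly_same_person(ancestry_cm, gedmatch_cm, max_ratio=3.0):
    """Return True if the two totals are within `max_ratio` of each other.

    `max_ratio` is an arbitrary illustrative cutoff, not a published rule.
    """
    if ancestry_cm <= 0 or gedmatch_cm <= 0:
        return False
    ratio = max(ancestry_cm, gedmatch_cm) / min(ancestry_cm, gedmatch_cm)
    return ratio <= max_ratio


print(plausibly_same_person(6.6, 3587.1))  # the RJE V2 example above: False
```

A check like this only rules candidates out. A plausible total still needs to be confirmed with other information before you paint the segment.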

For the sake of illustration, though, let’s say that you have found the right person and have been able to reasonably identify the match. In order to compare one-to-one, click on the highlighted blue “largest segment” in the autosomal category, shown below.

If you want to compare the X one-to-one, click on the blue largest segment in that column.

From this point, the matching will look the same as the one-to-one GedMatch matching shown in the previous section – so copy and paste as normal.

While this certainly isn’t the most effective way of working with Ancestry matches, it’s really the only hope we have, unless your match has also uploaded to either Family Tree DNA or MyHeritage.

However, in my experience, I generally stand a better chance of identifying Ancestry matches at GedMatch because their user name or the user name of the person managing their account can be found much more readily. People sometimes tend to utilize the same abbreviations, names or nicknames in multiple locations.

Summary

While each vendor has unique strengths and weaknesses today, and GedMatch provides a platform used by some but not all, the best way to effectively paint your chromosomes is to utilize all of the tools available, and sometimes together. I strongly suggest that you test at or upload to each vendor, because you will find matches at each vendor that aren’t elsewhere.

How many segments can you paint on your chromosomes, and what will those segments tell you?

In the next article, I’ll be walking through my chromosome painting gallery to take a look at the hidden messages there! I hope you’ll come along so you can find some hidden messages of your own.

Enjoy!

_____________________________________________________________________

DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts

Not long ago, Jonny Perl introduced the free online tool, DNA Painter, designed to paint your chromosomes. I didn’t get around to trying this right away, but had I realized just how much fun I would have, I would have started sooner.

Fittingly, Jonny, pictured above, won the RootsTech Innovation award this year for DNA Painter – and I must say, it’s quite well-deserved.

Congratulations Jonny!

  • This is the first of four articles about DNA Painter. In this article, we’ll talk about how to use the tool, and how to get started.
  • The second article talks about mining your matches at the various vendors for paintable segments with instructions for how to do that accurately with each vendor.
  • In the third article, we’ll walk through an analysis of my painted segments, so you can too – and know how to spot revelations.
  • The fourth article explains how I solved a long-standing mystery that was driving me crazy. If you have a relatively close mystery person in your DNA match list that you can’t figure out quite where they fit, this article is written just for you!

I’ll tell you right now, I haven’t had this much fun in a long time!

Want to hear the best part? You don’t have to triangulate. DNA painting is “self-triangulating.” Yes, really!

Let’s get started!

Introducing DNA Painter

To begin to use DNA Painter, you’ll need to set up a free account at www.dnapainter.com.

Read the instructions and create your profile.

Jonny provides an overview.  Don’t get so excited that you skip this, or you won’t know how to paint correctly. You don’t need to be Picasso, but taking a few minutes up front will save you mistakes and frustration later.

Blaine Bettinger recorded a YouTube video discussing how to use DNA Painter to paint your chromosomes to identify and attribute particular segments to specific ancestors. It includes a mini-lesson on chromosome matching.

I strongly suggest you take time to watch Blaine’s video from the beginning. For some reason, this link drops into the video near the end, but just slide the red bar back to the beginning.

Get Started

Here are my blank, naked chromosomes. Notice for every chromosome, you see a blue paternal “half” and a pink maternal “half.” That’s because everyone gets half of their autosomal DNA from their father, and the other half from their mother.

Looking at my own chromosome painting today, below, it’s incredibly exciting for me to see 57% of my DNA painted, attributed to 77 couples and one endogamous group, Acadians. This took me a month or so working off and on.

At the end of the day, this is often how I rewarded myself! The only problem is that it has been difficult to go to bed.

Comparatively, I’ve been working on my DNA match spreadsheet, attributing segments to ancestors now for 5 or 6 years, and I’ve never been able to see this information visually like this before. This view of my ancestrally painted chromosomes is so rewarding!

Who To Map

DNA Painter is not the kind of tool where you upload your results. Instead, it’s a tool where you selectively paint specific segments of matches – meaning segments on which you match particular people with known common ancestors.

How do you know who is a good candidate to map?

I began with painting my closest matches with whom I could identify the common ancestor.

Not only will painting your largest matches be rewarding as you harvest low-hanging-fruit, it will help you determine if you actually have identified the correct DNA for later matches being attributed to a specific genealogical line. In other words, mapping these larger known segments will help you identify false positives when you have no other yardstick.

Your First Painting

I’m opening a new profile in DNA Painter to demonstrate the steps in painting along with hints that I’ve learned along the way.

I’m going to utilize my cousin, Cheryl, whom I match closely at Family Tree DNA. If you don’t know how to use the Family Tree DNA autosomal tools, click here.

Cheryl is my first cousin once removed, so we share a significant amount of DNA.

I’ve selected Cheryl on my match list, checked her match box, and then clicked on the Chromosome Browser in order to view our segment matching information.

You can see on the chromosome browser that I share quite a bit of DNA with Cheryl.

At the top of the chromosome browser, click on “View this data in a table.”

Highlight and copy all of the segments for Cheryl. I only use 7cM segments or higher at DNA Painter, so you don’t have to copy the data in the rows below your last match at that level. DNA Painter takes care of stripping out all the extraneous stuff.

Paint a New Match

At DNA Painter, after you have your profile set up, click on “Paint a New Match.”

Simply paste the segment data into the box in the window that pops up. DNA Painter takes care of removing the header information as well as segments that are too small.
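For readers who like to see the logic spelled out, here is a minimal Python sketch of that kind of cleanup. It is not DNA Painter's actual code. The comma-separated column order (chromosome, start, end, cM, SNP count), the function name and the sample rows are all assumptions made for illustration, and the 7 cM cutoff is simply the one mentioned earlier.

```python
import csv
import io

MIN_CM = 7.0  # the cutoff mentioned in the article; adjust to taste


def parse_segments(pasted_text, min_cm=MIN_CM):
    """Return (chromosome, start, end, cM) tuples at or above min_cm.

    Assumes comma-separated rows whose last five fields are
    chromosome, start, end, centiMorgans, SNP count (an assumed layout).
    Rows that don't parse as numbers, such as a header, are skipped.
    """
    segments = []
    for row in csv.reader(io.StringIO(pasted_text)):
        if len(row) < 5:
            continue  # blank or incomplete line
        chromosome, start, end, cm, _snps = [field.strip() for field in row[-5:]]
        try:
            segment = (chromosome, int(start), int(end), float(cm))
        except ValueError:
            continue  # header or other non-data row
        if segment[3] >= min_cm:
            segments.append(segment)
    return segments


# Invented sample data, not a real match.
sample = """Chromosome,Start Location,End Location,Centimorgans,Matching SNPs
1,1000000,25000000,32.5,6000
1,90000000,93000000,3.1,700
3,5000000,60000000,58.2,11000
"""
kept = parse_segments(sample)
```

In this invented sample, the header row and the 3.1 cM segment are dropped, leaving two segments to paint.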

You can click on “overlay these segments” to “test” a fit, but I haven’t really found a good use for that, because I’m only painting segments I’m confident about and I know which side, maternal or paternal, the match is on based on the known relative.

Click on “save match now” in the bottom right corner.

In the Save Match popup, shown above, I utilize the fields as follows.

I enter the name of my DNA match, followed by their relationship to me, followed by the source of the match. In this case, “Cheryl <lastname>, 1C1R, FTDNA”

In the “Segment/Match Notes” I list how the match descends from the common ancestral couple, a GedMatch ID if known, and anything else pertinent including other potential ancestral lines in common. This means that I list every generation beginning with the common ancestral couple and ending with the tester.

Hiram Ferverda and Eva Miller, Roscoe, Cheryl, GedMatch Txxxxxx

You’ll wind up eventually rethinking some of your segment assignments to particular ancestors and you’ll want as much information here about this match as possible.

Moving to the next field, in the “Ancestors Name,” I utilize the couple’s name, because at this point, you can’t tell which of the two people actually contributed the DNA segment, or if part is from one ancestor of the couple and part is from the other. If the male ancestor is a Sr. or Jr., or is otherwise difficult to tell apart from your other ancestors, I suggest entering a birth year by his name. This is your selection list for later painting segments from the same ancestor, so you want to be sure you can tell the generations apart.

Next, you’ll select the maternal or paternal side of your family. Change the color if you don’t like the one pre-selected to assign to segments descending from that couple. Originally, I was going to have pinks or light colors for maternal, and blues or darker for paternal, but I quickly discovered that scheme didn’t work well, and I had more ancestors than I could ever have imagined whose DNA I am able to map and paint.

Therefore, pick contrasting colors. You can use each color on each half, meaning maternal and paternal, since the segments will be painted on different halves of the chromosome.

In the “Notes for This Group,” I add more information for the couple such as birth and death dates and location if I know or am likely to forget.

Click “save.”

Here you go!  Isn’t this fun!!!! Cheryl’s segments that match mine are painted onto my chromosomes!

At the right, your ancestor key appears with each ancestor to whom you’ve assigned a color key.

So far, I only have one!

Want to paint another group of segments?

Let’s paint Cheryl’s brother.

Following the same sequence, I paint Donald’s DNA, but this time, I select “Or link these segments to an ancestor I’ve added before.”

I select Hiram Ferverda, Eva Miller and save. The segments that I have in common with Cheryl and/or Don will now be displayed on each chromosome.

Looking at chromosome 1, you can see that I match Cheryl and Don on the same segment at the beginning of the chromosome, but received two different segments of DNA on a different portion of chromosome 1, further to the right.

As one last example, I added the DNA from two known cousins, Rex and Maxine, who descend a couple generations further back in time through more distant ancestors in the same line – one maternal and one paternal.

Click on the chromosome number to expand to see all of the painted segments.

You can see, looking at chromosome 3, that Cheryl and Don match me on a significant amount of the same large pink segment plus a smaller pink segment at the end.

Rex (yellow) and Maxine (blue) both match me on different parts of the chromosome. It looks like there is a small amount of overlap between Rex and Maxine which is certainly feasible, because Jacob Lentz, the ancestor that Maxine descends from is ancestral to the couple that Rex descends from.

By utilizing known matches, and mapping, we can see segments that move us back in time, telling us from which ancestor that portion of the segment descends.

For example, if the blue segment was directly aligned with one of the pink segments, then we would know that the blue portion of the pink segment descended from Jacob Lentz and Fredericka Reuhl.
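The alignment idea can be sketched as a simple interval overlap. This is a minimal illustration, assuming each painted segment is recorded as a chromosome plus start and end positions on the same parental side. The function name and the positions are invented for the example and are not taken from DNA Painter.

```python
def overlap(seg_a, seg_b):
    """Return the (start, end) shared by two segments, or None.

    Each segment is (chromosome, start, end). Both are assumed to be
    painted on the same parental side, otherwise the comparison is moot.
    """
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return None
    start, end = max(start_a, start_b), min(end_a, end_b)
    return (start, end) if start < end else None


# Invented positions for illustration only.
pink = ("3", 10_000_000, 90_000_000)  # closer cousins, more recent couple
blue = ("3", 40_000_000, 55_000_000)  # more distant cousin, older couple
shared = overlap(pink, blue)
```

Here the whole blue segment falls inside the pink one, so that stretch of the pink segment would be attributed to the older ancestral couple.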

This is the most awesome, extremely addictive game of ancestor Sudoku ever.

Wanna play???

Here’s how to prepare for my next article where we’ll utilize the various vendor matches to begin painting.

Download and Upload Your Autosomal Files

You’ll want to have your DNA at as many vendors as possible so you can find all your matches that can be attributed to known relatives and ancestors. You never know who is going to test at which vendor, and the only way to find out is to have your DNA there too.

For each vendor, I’ve provided a mini-tutorial on how to maximize your testing and transfers both monetarily and for maximum matching effect, or you can read an article here that explains more.

There’s also a cheat sheet for transfer strategies at the end of this article.

A technique called imputation is mentioned below, so you may want to read about imputation here. MyHeritage’s initial offering utilizing imputation was problem plagued but has since improved significantly.

Ancestry

To Ancestry – There’s no way to transfer files TO Ancestry, so you’ll need to test there to be in their database. You will also need at least a minimum subscription ($49) to utilize all of the Ancestry DNA features. You can see a with and without subscription feature comparison chart here.

From Ancestry – There is also no chromosome browser at Ancestry. In order to use DNA Painter, chromosome segment information is required, so if you test at Ancestry and want to paint your segments, you’ll need to download your DNA file and upload it to any or all of:

  • Family Tree DNA – partially compatible with the current Ancestry test chip format – transfer will provide you with your closest matches, 20-25% of the matches you would have if you tested at Family Tree DNA
  • MyHeritage – partially compatible, but uses imputation to infer additional genetic regions
  • GedMatch

My preference is to test at Ancestry, and then test at Family Tree DNA and upload the test results to MyHeritage. The Family Tree DNA and MyHeritage testing platforms are the same, so there is no incompatibility between the two.

Family Tree DNA

To Family Tree DNA – You can upload the following vendor files TO Family Tree DNA.  Matching is free, but to use the advanced tools, including ethnicity and the chromosome browser, you’ll need to pay the $19 unlock fee. That’s still significantly less than retesting, especially for files that are 100% compatible.

  • Ancestry – V1 files generated from before May 2016 are entirely compatible, V2 files from after May 2016 are partially compatible, providing between 20-25% of your matches, meaning your closest matches
  • 23andMe – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 platform file beginning in August 2017 is not compatible
  • MyHeritage – fully compatible
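If you juggle several kits, the list above can be kept as a small lookup table. This is only a sketch of the compatibility notes in this article as of its writing. The dictionary, the function name and the status labels are made up for illustration, and vendors do change their chips over time.

```python
# Uploads TO Family Tree DNA, as described in the list above.
# Keys are (vendor, file version); None means no version distinction is given.
FTDNA_UPLOAD = {
    ("Ancestry", "V1"): "entirely compatible",
    ("Ancestry", "V2"): "partially compatible",  # closest 20-25% of matches
    ("23andMe", "V3"): "compatible",
    ("23andMe", "V4"): "compatible",
    ("23andMe", "V5"): "not compatible",
    ("MyHeritage", None): "fully compatible",
}


def ftdna_upload_status(vendor, version=None):
    """Look up the status noted in the article, or 'unknown' if not listed."""
    return FTDNA_UPLOAD.get((vendor, version), "unknown")


status = ftdna_upload_status("Ancestry", "V2")
```

Anything not covered by the list comes back as "unknown" rather than a guess.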

From Family Tree DNA – You can upload your Family Finder results to:

MyHeritage

To MyHeritage – You can upload the following files to MyHeritage:

  • Family Tree DNA – fully compatible
  • Ancestry – partially compatible but uses imputation to infer additional genetic regions
  • 23andMe – partially compatible but uses imputation to infer additional genetic regions

From MyHeritage – If you test at MyHeritage, you can upload your files to:

23andMe

To 23andMe – You cannot transfer TO 23andMe, so you’ll need to test there if you want to be in their database.

From 23andMe – If you tested at 23andMe, you can upload your files to the following vendors:

  • Family Tree DNA – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 chip beginning in August 2017 is not compatible
  • MyHeritage – partially compatible but uses imputation to infer additional genetic regions
  • GedMatch – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 chip beginning in August 2017 is only compatible in the Genesis sandbox area. V5 matching is not reliable. Files from other vendors are recommended for GedMatch unless you are matching against another V5 result.

GedMatch

GedMatch is a third-party site that accepts all of these vendors’ autosomal files, with a caveat that the 23andMe V5 kit matches very poorly and requires special handling. I don’t recommend using that kit at GedMatch unless you are matching against other 23andMe V5 kits.

I upload multiple kits to GedMatch and mark all but one for research only. This allows me to use my Ancestry kit to match with other Ancestry users for more accurate matches, my Family Tree DNA kit to other Family Tree DNA kits, and so forth. Not marking multiple kits for research means that you’ll appear more than once on other people’s match lists, and only your first 2000 matches are free. Marking all kits except one as research is a courtesy to others.

Recommended Testing Strategy for New Testers

  1. Test at Ancestry, then download your file and upload it to GedMatch.
  2. Test at Family Tree DNA and upload to MyHeritage and GedMatch.
  3. Test at 23andMe and upload to GedMatch Genesis.
  4. At GedMatch, mark all except one kit as “research,” then utilize your kits from the same vendor for one-to-one comparisons.
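The last step can be pictured with a short sketch. It assumes you keep a simple record of which GedMatch kit came from which vendor. The kit IDs are placeholders and the function names are invented, so treat this as an illustration of the idea and not as anything GedMatch provides.

```python
def kits_to_mark_research(my_kits, public_vendor):
    """Every kit except the one left public gets marked as research."""
    return sorted(vendor for vendor in my_kits if vendor != public_vendor)


def kit_for_comparison(my_kits, match_vendor, public_vendor):
    """Prefer my kit from the match's vendor; otherwise use my public kit."""
    return my_kits.get(match_vendor, my_kits[public_vendor])


# Placeholder kit IDs, not real GedMatch numbers.
my_kits = {"Ancestry": "KIT-A", "Family Tree DNA": "KIT-F", "23andMe": "KIT-M"}

research = kits_to_mark_research(my_kits, "Family Tree DNA")
chosen = kit_for_comparison(my_kits, "Ancestry", "Family Tree DNA")
```

With these placeholders, the Ancestry and 23andMe kits are marked research, and a one-to-one comparison against an Ancestry tester uses the Ancestry kit.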

Recommended Transfer Strategy

Of course, where you have and haven’t already tested will impact your transfer strategy decision. I’ve prepared the following cheat sheet to be used in combination with the information discussed above.

*Unless you can transfer a 23andMe V3/V4 or an Ancestry V1 kit to Family Tree DNA, it’s better to test at Family Tree DNA. Ancestry V2 tests are only 20-25% compatible.

A transfer from Family Tree DNA to MyHeritage is best because those vendors are on the same platform and the tools at MyHeritage are free.

In my next article, we’ll discuss how to mine your matches at the various vendors to obtain accurate segments for chromosome painting – including a strategy for how to utilize Ancestry and Gedmatch together to identify at least some Ancestry segment matches.

So, for now, get ready by transferring your DNA files into whichever data bases they aren’t already in. The only data base where I couldn’t identify matches that I didn’t have elsewhere was at 23andMe. The rest were all there just waiting to be harvested!

_____________________________________________________________________


GDPR – It’s a Train and It’s a Comin’

In the recent article about Oxford Ancestors shuttering, I briefly mentioned GDPR. I’d like to talk a little more about this today, because you’re going to hear about it, and I’d rather you hear about it from me than from a sky-is-falling perspective.

It might be rainy and there is definitely some thunder and the ground may shake a little, but the sky is not exactly falling. The storm probably isn’t going to be pleasant, however, but we’ll get through it because we have no other choice. And there is life after GDPR, although in the genetic genealogy space, it may look a little different.

And yes, one way or another, it will affect you.

What is GDPR?

GDPR, which is short for General Data Protection Regulation, is a European regulation, meaning both EU and UK, by which the European Parliament, the Council of the European Union, and the European Commission intend to strengthen and unify data protection for all individuals within the European Union (EU). It also addresses the export of personal data outside the EU/UK and the processing of data of residents of the EU/UK by non-EU/UK companies.

There are actually two similar, but somewhat different regulations, one for the UK and one for the EU’s 28 member states, but the regulations are collectively referred to as the GDPR regulation.

Ok, so far so good.

The regulations are directly enforceable and do not require any individual member government to pass additional legislation.

GDPR was adopted on April 27, 2016, but little notice was taken until the last few months, especially outside of Europe, when the hefty fines drew attention to the enforcement date of May 25, 2018, now just around the corner.

Those hefty fines can range from a written warning for non-intentional noncompliance to a fine of 20 million Euro or up to 4% of the annual worldwide turnover of the preceding financial year, whichever is GREATER. Yea, that’s pretty jaw-dropping.
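To make the “whichever is GREATER” part concrete, here is a minimal sketch of the ceiling as described above. The function name and the turnover figures are invented for illustration. This is arithmetic only, not legal advice, and actual fines are set case by case.

```python
def max_gdpr_fine(annual_worldwide_turnover_eur):
    """Ceiling described above: the greater of 20 million euro
    or 4% of the preceding year's worldwide turnover."""
    four_percent = annual_worldwide_turnover_eur * 4 / 100
    return max(20_000_000, four_percent)


# Invented turnover figures for illustration.
small_company = max_gdpr_fine(5_000_000)      # the 20 million figure applies
large_company = max_gdpr_fine(2_000_000_000)  # 4% applies
```

For the small company, the ceiling is four times its entire annual turnover, which is why the figure is so alarming for small vendors.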

So, GDPR has teeth and is nothing to be ignored.

Oh, and if you think this is just for EU or UK companies, it isn’t. It applies equally to any company that possesses any data of any EU or UK resident in their data base or files, providing that person isn’t dead. The law excludes dead people and makes some exceptions for law enforcement and other national security types of applications.

Otherwise, it applies to everyone in a global economy – and not just for future sales, but to already existing data for anyone who stores, transmits, sells to or processes data of any EU resident.

What Does GDPR Do?

The intent of GDPR was to strengthen privacy and data protections, but there is little latitude written into this regulation that allows for intentional sharing of data. The presumption throughout the hundreds of pages of lawyer-speak is that data is not intended to be shared, thereby requiring companies to take extraordinary measures to encrypt and anonymize data, even going so far as to force companies to store e-mail addresses separately from any data which could identify the person. Yes, like a name, or address.

Ironic that a regulation that requires vendor language be written in plainly understood simple wording is in and of itself incredibly complex, mandating legal interpretation.

Needless to say, GDPR requirements are playing havoc with every company’s data bases and file structure, because information technology goals have been to simplify and unify, not chop apart and distribute information, requiring a complex network of calls between systems.

Know who loves GDPR? Lawyers and consultants, that’s who!

In the case of intentional sharing, such as genetic genealogy, these regulations are already having unintended consequences through their extremely rigid requirements.

For example, a company must appoint a legal representative in Europe. I am not a lawyer, but my reading of this requirement suggests that European appointed individual (read, lawyer) is absorbing some level of risk and could potentially be fined as a result of their non-European client’s behavior. So tell me, who is going to incur that level of risk for anything approaching a reasonable cost?

One of the concepts implemented in GDPR is the colloquially known “right to be forgotten.” That means that you can request that your data and files be deleted, and the company must comply within a reasonable time.

However, what does “the right to be forgotten” mean, exactly? Does it mean a company has to delete your public presence? What about their internal files that record that you WERE a customer. What about things like medical records? What about computer backups which are standard operating procedure for any responsible company? What happens when a backup needs to be restored? If the company tracks who was deleted, so they can re-delete them if they have to restore from backup, then the person isn’t deleted in the first place and they are still being tracked – even though the tracking is occurring so the person can be re-forgotten.

Did you follow that? Did it make sense? Did anyone think of these kinds of things?

Oh, and by the way, there is no case law yet, so every single European company and every single non-European company that has any customer base in Europe is scrambling to comply with an incredibly far-reaching and harsh regulation with extremely severe potential consequences.

How many companies do you think can absorb this expenditure? Who do you think will ultimately pay?

Younger people may not remember Y2K, but I assuredly do, and GDPR is Y2K on steroids and with lots of ugly teeth in the form of fines and penalties that Y2K never had. The worst scenario for Y2K was that things would stop working. GDPR can put you out of business in the blink of an eye.

Categories of “Processors”

GDPR defines multiple levels of “processors,” a primary controller and a secondary processor plus vaguely defined categories of “third party” and “joint controller.”

The “controller” is pretty well defined as the company that receives and processes the data or order, and a “processor” is any other entity, including an individual person, who further processes data on behalf of or as a result of the controller.

There appears to be no differentiation between a multi-million-dollar company and one person doing something as a volunteer at home for most requirements – and GDPR specifically says that lack of pay does not exempt someone from GDPR. The one possible exception is an exclusion for organizations employing fewer than 250 persons, “unless processing is likely to result in a risk to the rights and freedoms of the data subject.” I’m thinking that just mentioning the word DNA is enough to eliminate this exemption.

Furthermore, GDPR states that controllers and processors must register.

Right about now, you’re probably asking yourself if this means you if you’re managing multiple DNA kits, working with genetic genealogy, either as a volunteer or professionally, or even managing a group project or Facebook group.

The answer to those questions is that we really don’t know.

ISOGG has prepared a summary page addressing GDPR from the genetic genealogy perspective, here. The ISOGG working group has done an excellent job in summarizing the questions, requirements and potential effects of the legislation in the slide presentation, which I suggest you take the time to view.

This legislation clearly wasn’t written considering this type of industry, meaning DNA shared for genealogical purposes, and there has been no case law yet surrounding GDPR. No one wants to be the first person to discover exactly how this will be interpreted by the courts.

The requirements for controllers and processors are much the same and include very specific requirements for how data can be stored and what must be done in terms of the “right to be forgotten” requests within a reasonable time, generally mentioned as 30 days after the person who owns the data requests to be forgotten. This would clearly apply to some websites and other types of resources used and maintained by the genetic genealogy community. If you are one of the people this could affect, meaning you maintain a website displaying results of some nature, you might want to consider these requirements and how you will comply. Additionally, you are required to have the explicitly given consent of every person whose results are displayed.

For genetic genealogists, who regularly share information through various means, and the companies who enable this technology, GDPR is having what I would very generously call a wet blanket effect.

What’s Happening in the Genetic Genealogy Space?

So far, we’ve seen the following:

  • Oxford Ancestors has announced they are shuttering, although they did not say that their decision has anything to do with GDPR. The timing may be entirely coincidental.
  • Full Genomes Corporation has announced on social media that they are no longer accepting orders from EU or UK customers, stating that “the regulatory cost is too high for a small company” and is “excessive.” I would certainly agree with that. Update: On 3-31-2018, Justin Loe, CEO of Full Genomes, said that they “will continue to sell into the EU via manual process.”
  • Ancestry has recently made unpopular decisions relative to requiring separate e-mails to register different accounts, even if the same person is managing multiple DNA kits. Ancestry did not say this had to do with GDPR either, but in reading the GDPR requirements, I can understand why Ancestry felt compelled to make this change.
  • Family Tree DNA recently removed a search feature from their primary business page that allowed the public to search for their ancestors in trees posted to accounts at Family Tree DNA. According to an e-mail sent to project administrators, this change was the result of changes required by GDPR. They too are working on compliance.
  • MyHeritage is as well.
  • I haven’t had an opportunity to speak privately with LivingDNA or 23andMe, but I would presume both are working on compliance. LivingDNA is a UK company.

One of my goals recently when visiting RootsTech was to ask vendors about their GDPR compliance and concerns. That’s the one topic sure to wipe the smile off of everyone’s face, immediately, generally followed by grimaces, groans and eye-rolls until they managed to put their “public face” back on.

In general, vendors said they were moving towards compliance but that it was expensive, difficult and painful – especially given the ambiguity in some of the regulation verbiage. Some expressed concerns that GDPR was only a first step and would be followed by even more painful future regulations. I would presume that any vendor who is not planning to become compliant would not have spent the money to have a booth at RootsTech.

The best news about GDPR is that it requires transparency – in other words, it’s supposed to protect customers from a company selling your anonymized DNA out the back door without your explicitly given consent, for example. However, the general consensus was that any company that wanted to behave in an unethical manner would find a loophole to do so, regardless of GDPR.

In fairness, hurried consumers bring this type of thing on themselves by clicking through the “consent,” or “agree” boxes without reading what they are consenting to. All the GDPR in the world won’t help this. The company may have to disclose, but the consumer doesn’t have to read, although GDPR does attempt to help by forcing you to actively click on agree.

I’m sure we’ll all be hearing more about GDPR in the next few weeks as the deadline looms ever closer.

May 25, 2018

Now you know!

There’s nothing you can do about the effects of GDPR, except hold on tight as the vendors on which we depend do their best to navigate this maze.

Between now and May 25th, and probably for some time thereafter, I promise to be patient and not to complain about glitches in vendors’ systems as they roll out new code as seamlessly as possible.

Gluttons for Punishment

For those of you who are really gluttons for punishment, here are the actual links to the documents themselves. Of course, they are also guaranteed to put you to sleep in about 27 seconds flat…so a sure cure for insomnia.

_____________________________________________________________________


Who Tests the X Chromosome?

Recently, someone asked which of the major DNA testing companies test the X chromosome and which ones use the X in matching. How does this difference influence the quality of our matches?

| Vendor | X in Download File | Uses X in Matching | X Included in Total cM Count |
|---|---|---|---|
| 23andMe | Yes | Yes | Yes |
| Family Tree DNA | Yes | Yes (if have a match on another chromosome) | No |
| Ancestry | Yes | *No | No |
| MyHeritage | Yes | No | No |
| GedMatch | N/A | Separately | No |

*If Ancestry did utilize the X in matching, it wouldn’t benefit customers because Ancestry does not show segment information by chromosome.  In other words, no chromosome browser.

Family Tree DNA includes any size X match IF and only if the two people already match on a different chromosome.

GedMatch, of course, isn’t a vendor who does DNA testing, so they don’t provide download files.  They are solely on the receiving end.

X CentiMorgan Counts

Due to variations in the way vendors calculate matches and total cM counts, your mileage may vary a bit.

In other words, the 23andMe cM total, if an X match is involved, may be slightly more than a match between the same two people at Family Tree DNA, where the X match cM is not included in the cM total.

Conversely, you won’t show an X match with someone at Family Tree DNA if there isn’t also another segment on a different chromosome that matches.
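To make the difference concrete, here is a minimal sketch of how the same pair of testers could show different totals at different vendors. It only models the table above and is not any vendor's actual algorithm. The function names are hypothetical, and "autosomal_cm" is assumed to be the matching cM on chromosomes 1-22 that the vendor has already counted.

```python
# Illustrative model of the vendor table; NOT any vendor's real matching code.
# True means the vendor adds X centiMorgans to the reported total cM.
X_IN_TOTAL = {
    "23andMe": True,
    "Family Tree DNA": False,
    "Ancestry": False,
    "MyHeritage": False,
}


def reported_total_cm(vendor, autosomal_cm, x_cm):
    """Total cM a vendor would show for one pair of testers (hypothetical helper)."""
    return autosomal_cm + (x_cm if X_IN_TOTAL[vendor] else 0.0)


def ftdna_shows_x_match(autosomal_cm, x_cm):
    """An X match is listed only if the pair also matches on another chromosome."""
    return x_cm > 0 and autosomal_cm > 0


print(reported_total_cm("23andMe", 50.0, 14.0))          # X counted in the total
print(reported_total_cm("Family Tree DNA", 50.0, 14.0))  # X left out of the total
print(ftdna_shows_x_match(0.0, 14.0))                    # X-only match is not shown
```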

In general, due to the thin spread of SNPs on the X chromosome, you will need, on average, a cM match that is twice as large as on other chromosomes to be considered of equal weight.

In other words, a 10 cM match on the X chromosome would only be genealogically equivalent to approximately a 5 cM match on any other chromosome.

X matches really can’t be evaluated by the same rules as other chromosomes due both to their SNP paucity and their inheritance path, which is why most vendors don’t include those segments in the total cM count.
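The rule of thumb above is simple arithmetic, sketched here. The factor of 2 is the approximation given in this article, not a published constant, and the function name is made up for illustration.

```python
def x_equivalent_cm(x_cm, factor=2.0):
    """Rough autosomal-equivalent weight of an X segment (rule of thumb only)."""
    return x_cm / factor


# A 10 cM X segment carries about the weight of a 5 cM segment elsewhere.
print(x_equivalent_cm(10.0))
```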

X Matches

While including the X chromosome cM count is problematic, X matching can be a huge benefit because of the unique inheritance path of the X chromosome.

In the article, X Marks the Spot, we discussed the inheritance path of the X chromosome for both males and females. Females inherit an X chromosome from both father and mother, which recombines just like chromosomes 1-22.  However, men only inherit an X from their mother, because they inherit a Y from their father instead of the X.  Therefore, males will only inherit an X from their mother, and the X that females inherit from their father is the one he received from his mother.

Charting Companion software works with your genealogy software of choice to produce a lovely fan chart where the contributors of my X chromosome are charted in color, above. You can read more about Charting Companion here.

The great news is that if you and a match share a significant portion of the X chromosome, meaning more than 15 cM which reduces the likelihood of an identical by chance match, the common ancestor (on that segment) has to come from an ancestor in your direct X path.
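If you number your ancestors Ahnentafel-style (you are 1, a person's father is twice their number and the mother is twice their number plus one), the direct X path can be listed mechanically. This is a minimal sketch of the inheritance rules described above. The function name and arguments are my own for illustration, not part of any vendor's or charting tool's software.

```python
def x_ancestors(root_is_male, generations):
    """Return Ahnentafel numbers of ancestors who could have contributed X DNA.

    A male receives an X only from his mother; a female receives one from
    each parent. Father of person n is 2n, mother is 2n + 1.
    """
    result = []
    frontier = [(1, root_is_male)]  # (ahnentafel number, is_male)
    for _ in range(generations):
        next_frontier = []
        for number, is_male in frontier:
            father, mother = 2 * number, 2 * number + 1
            if not is_male:
                # Only a female has an X from her father.
                next_frontier.append((father, True))
            next_frontier.append((mother, False))
        result.extend(n for n, _ in next_frontier)
        frontier = next_frontier
    return result


# X path for a male tester, three generations back.
print(x_ancestors(True, 3))
# X path for a female tester, two generations back.
print(x_ancestors(False, 2))
```

Any ancestor whose number is not in the list cannot be the source of a shared X segment, which is what narrows the search.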

I’m always excited to see with whom I share an X.  That piece of information alone helps me focus my ancestor detective efforts on a specific portion of my tree.

Some X segments can remain intact for generations and may be very old.  So don’t be surprised if the common ancestor of the X segment is not the same ancestor as the common ancestor of another matching segment.

Sorting by X

I wasn’t able to find a way to sort by X chromosome matches at 23andMe, but you can sort by the X at both Family Tree DNA and GedMatch.

At GedMatch, X matching shows on the one-to-many match page.  You can sort by either Total X cM or Largest X cM by using the up and down arrows, at right, below, in the X DNA columns.

After you identify an X match, be sure to run the X one-to-one match option to verify.

My GedMatch matches cause me to wonder if 23andMe is using a different reporting threshold for the X chromosome, because one of my matches at GedMatch is a close family member with no X match at 23andMe, but a total of 32 X cM and with a longest segment of 14 X cM at GedMatch.

That same individual matches me with the largest X segment of 14 cM at Family Tree DNA as well.

Family Tree DNA X Match Phasing

At Family Tree DNA, on your Family Finder matches page, just click on the X-Match header (at right, below) to bring all of your X matches to the top of your list.

If you have linked any kits of relatives to your tree, you will see numbers of phased kits on the maternal and paternal tabs with the red and blue male and female icons. In the example above, I have 3313 matches total, with 744 being paternal, 586 being maternal.

Next, click on the maternal or paternal tab to see only the people with X matches who match you on your maternal or paternal line. Matches are automatically sorted into maternal and paternal “buckets” for you. Remember to check the size of the X match before deciding about relevance.

Who is your largest X match that you don’t already know?  Maybe you can find your common ancestor today.

Have fun!!!


DNAGedcom Client

DNAGedcom provides an incredibly cool tool that has helped me immensely with my genealogy research, particularly at Ancestry and Family Tree DNA. This tool doesn’t replace what Ancestry and Family Tree DNA provide, but augments the functionality significantly.

I’ve been frustrated for months by the broken search function at Ancestry, and the DNAGedcom tool allows you to bypass the search function entirely by downloading the direct line ancestral information for all of your matches. So let’s use my Ancestry account as an example.

Utilizing DNAGedcom

After installing the DNAGedcom tool on your system, sign on to your Ancestry account through the tool. The tool downloads all of your matches, the people you match in common with them, and the ancestors in your matches’ trees.

The best part about this is that the results are then in a spreadsheet file that you can simply sort utilizing normal spreadsheet functions. I wrote about using spreadsheets for genetic genealogy in the article, Concepts – Sorting Spreadsheets for Autosomal DNA.

In my case, this means I can see everyone who I match that has an Estes, or any other surname, in their tree. I don’t have to look at my matches’ trees one at a time.

You can read about this very cool tool at this link, including how to subscribe for either $5 per month or $50 per year. Many functions at DNAGedcom are free, but the Ancestry tool is available through a minimal subscription which helps to support the rest of the site.

After subscribing, the DNAGedcom client will become available to you on your subscriber page at DNAGedcom.


After you subscribe, you’ll see the link for the Ancestry download tool, along with other resources.

You will want to follow the installation directions, exactly, to download the DNAGedcom client onto your PC or Mac in preparation for downloading your Ancestry match information onto your system. This is painless and goes quickly.

Next, you will be prompted to sign in to both DNAGedcom and Ancestry, through the tool, and then you will be prompted for three separate steps at Ancestry:

  • Gather Matches – took about 10 minutes
  • Gather Trees – let’s just say you might want to run this one overnight, and on a directly connected system, not wifi. Mine was about 25% complete at the 2 hour mark
  • Gather ICW – another several hours, but you can do other things on your system at the same time

The downloaded files will be stored on your computer as .csv files. On my PC, the default location was in the Documents directory and the files are named as follows:

  • a_Roberta_Estes (the ancestors of my matches)
  • icw_Roberta_Estes (the people I match and who I match in common with them)
  • m_Roberta_Estes (information about the match, such as cMs, etc.)

It’s important to make a note of this, as I didn’t find the file names documented elsewhere.

The good news is that even though these steps take a long time, having all of this information in a place where you can sort it and use it effectively is extremely useful. You can run the various steps at night or when you aren’t otherwise using your system.

In addition, if someone is sharing their DNA results with you on Ancestry (which they can under the settings gear), you can download the same data for their account – and then you can look for commonalities between groups of results using the DNAGedcom Match-O-Matic tool, also described in the introductory document.

Using the Downloaded Files

Personally, what I wanted to do was to search for all occurrences of a particular surname. Fortunately, it was Claxton or Clarkson, not Smith.

Simply using Excel (after saving the results file in Excel format), I was able to quickly sort for these surnames, an example shown below. Hmmm, I wonder if Claxon is relevant too. I never considered that possibility – nor would I have ever seen Claxon in a surname search, because I wouldn’t have searched for Claxon.
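If you prefer a script to a spreadsheet, the same surname hunt can be sketched in a few lines of Python. The column names and sample rows here are hypothetical stand-ins, since I haven't documented the real layout of the ancestors file, so check the header row of your own download and adjust. The pattern deliberately catches close spellings such as Claxon that a literal search would miss.

```python
import csv
import io
import re

# Hypothetical sample standing in for the downloaded ancestors .csv file.
# The real column names may differ; look at the first line of your file.
SAMPLE = """match_name,surname,birth_place
Match 1,Claxton,Virginia
Match 2,Smith,Ohio
Match 3,Clarkson,Tennessee
Match 4,Claxon,North Carolina
"""

# Matches Claxton, Claxon and Clarkson, ignoring case.
PATTERN = re.compile(r"^cla(x|rks)t?on$", re.IGNORECASE)


def rows_with_surname(lines, pattern, column="surname"):
    """Return the rows whose surname column matches the pattern."""
    reader = csv.DictReader(lines)
    return [row for row in reader if pattern.match(row[column].strip())]


hits = rows_with_surname(io.StringIO(SAMPLE), PATTERN)
for row in sorted(hits, key=lambda r: r["surname"]):
    print(row["surname"], "-", row["match_name"])
```

To run it against a real download, pass an open file object in place of the io.StringIO sample.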

I’m brick walled on the Claxton line in Russell County, Virginia in about 1799. My ancestor, James Lee Claxton, was born someplace in Virginia about 1775. Utilizing Y DNA, we know of another man, also named James Claxton, born about 1750 first found in Granville and Bertie County, NC, who sired an entire lineage of Claxtons who migrated to Bedford County, TN.  However, that James is not the father of my ancestor, because that James had a different son named James. Other than these two distinct groups, we can’t seem to match with anyone else who has tested their Y DNA at Family Tree DNA, so my hope, for now, is an autosomal match with a known Claxton line out of Virginia.

(Shameless plug – if you are a Claxton or Clarkson male, please test your Y DNA at Family Tree DNA and join the Claxton DNA project. If you have Claxton or Clarkson ancestry from any line, and have taken the Family Finder test or transferred autosomal results from another vendor, please join the Claxton/Clarkson DNA project at Family Tree DNA. If you have Claxton or Clarkson ancestry and haven’t yet DNA tested, please do.)

Therefore, my goal is to find matches to other Claxton or Clarkson individuals who don’t share a known common ancestor with me. Because we don’t share a known common ancestor, of course, these people would never be shown as an Ancestry green leaf “DNA+tree match,” nor is there another way for me to obtain a surname list like this at Ancestry.

After finding Claxton candidates, then I can refer to the other downloaded files or sign on to my account at Ancestry to look at the match itself and other ICW matches. Hopefully, some of my matches will also match some of my Claxton cousins as well, which would suggest that the match might actually be through the Claxton line.
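That in-common-with check boils down to a set intersection. Here is a minimal sketch, assuming you have already loaded the ICW file into a dictionary that maps each match to the set of people they match in common with you. The names and the data structure are placeholders of my own, not the tool's format.

```python
def shared_with_known_cousins(icw, candidate, known_cousins):
    """Known cousins who appear in the candidate's in-common-with list."""
    return sorted(icw.get(candidate, set()) & set(known_cousins))


# Hypothetical data: each match mapped to their ICW matches.
icw = {
    "Match 1": {"Cousin A", "Cousin B", "Stranger"},
    "Match 2": {"Stranger"},
}
claxton_cousins = {"Cousin A", "Cousin B"}

print(shared_with_known_cousins(icw, "Match 1", claxton_cousins))
print(shared_with_known_cousins(icw, "Match 2", claxton_cousins))
```

An overlap is only suggestive, of course. It tells you where to look next, not which ancestor you share.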

The DNAGedcom client also downloads the same type of information from 23andMe, which isn’t nearly as useful without trees, as well as from Family Tree DNA.

Thanks so much to www.dnagedcom.com.


2018 Resolution – Unveiling Hidden Evidence

I spent New Year’s Eve doing what I’ve done for years on New Year’s Eve – celebrating by researching. In fact, it was at the stroke of midnight in 2005 that I ordered kit number 50,000 from Family Tree DNA.  Yes, I’m just that geeky and yes, I had to purchase several kits in a row to get number 50,000.

That kit went on to help immensely, as I used it to test an elderly cousin of my great-grandmother’s generation who took both the Y DNA test, and then, eventually, autosomal.

This year I made a wonderful discovery to mark the new year.  But first, let’s see how I did with last year’s resolution.

Last Year’s Resolution

Last year, I made 1 resolution. Just one – to complete another year’s worth of 52 Ancestor stories.

Now, that didn’t mean I had to do 52 in total.  It meant I had to be committed to this project throughout the year.  You know, unlike cleaning out that closet…or losing weight…or exercising more. Commitments that are abandoned almost as soon as they are made.

So, how did I do?

I published 37 stories.  I shudder to think how many words or even pages that was.  I’m ashamed to say that I plucked much of the “low hanging fruit” early on, so these were tough ancestors for an entire variety of reasons.

That’s not one article each week, but at least I’m making steady progress. And I must say that I couldn’t do it without a raft of helpers – all of whom I’m exceedingly grateful to.  Friends, professionals, cousins, DNA testers, blog subscribers and commenters – an unbelievable array of very kind souls who are willing to give of their time and share their results. Thank you each and every one!

Now, I’m thrilled to tell you that Amy Johnson Crow has revitalized the 52 Ancestors project.  It’s free and you can sign up here.  There’s no obligation, but Amy provides suggestions and a “gathering place” of sorts. Think of her as your genealogy cheerleader or coach. It’s so much easier with friends and teammates! I miss reading other people’s stories, but I won’t have to miss that much longer!

Randy Seaver (of Genea-Musings) and I will have company once again.  He’s the only other person that I’m aware of who has continued the 52 Ancestors project – and he has put me to shame.  I do believe he published number 286 this week.  I keep hoping that some of his ancestors and some of mine are the same so I can piggyback on Randy’s research! I need an index! Randy, are you listening?

You might wonder why I enjoy this self-imposed deadline ancestor-writing so much.

It’s really quite simple.  It’s an incredible way to organize and sort through all of your accumulated research “stuff.”  I cherish the end product – documenting my ancestors’ lives with dates, compassion and history.  BUT, I absolutely hate parts of the research process – and the deadline (of sorts) gets me through those knotholes.

I absolutely love the DNA, and I really, REALLY like the feeling of breaking through brick walls.  It’s like I’m vindicating my ancestors and saving them from the eternal cutting room floor. DNA is an incredible tool to do just that, and there are very few ancestors I can’t learn something about from their DNA, one way or another – Y, mtDNA, autosomal and sometimes, all three.  And yes, DNA is in every one of my articles, one way or another. I want everyone to learn how to utilize DNA in the stories of their ancestors’ lives.  In many cases the DNA of theirs that we (and our cousins) carry is the only tangible thing left of them. We are walking historical museums of our ancestral lines!

How Did You Do?

Not to bring up an awkward subject, but if you recall, I asked you if you had any genealogy resolutions for 2017?  How did you do?

Congratulations if you succeeded or made progress.

It’s OK if you didn’t quite make it. Don’t sweat last year.  It’s over and 2018 is a brand spanking new year.

New Year Equals New Opportunities

2018 is stacking up to be a wonderful year. There are already new matches arriving daily due to the Black Friday sales and that’s only going to get better in the next month or two.  Of course, that’s something wonderful to look forward to in the dead of winter.  We’ll just call this my own personal form of hibernating. Could I really get away with not leaving my house for an entire month? Hmmm….

I want to give you three ideas for having some quick wins that will help you feel really great about your genealogy this year.

Idea 1 – Finding Hidden Mitochondrial DNA

This happened to me just last night and distracted me so badly that I actually was late to wish everyone a Happy New Year.  Yes, seriously.  One of my friends told me this is the best excuse ever!

I was working on making a combined tree for the descendants of an ancestor who have tested and I suddenly noticed that one of the female autosomal matches descended from the female of the ancestral couple through all females – which means my match carries my ancestor’s mitochondrial DNA!

Woohooooooo – it’s a wonderful day.

Better yet, my match tested at Family Tree DNA AND had already taken the mitochondrial DNA test.

Within about 60 seconds of noticing her pattern of descent, I had the haplogroup of our common ancestor. That’s the BEST New Year’s gift EVER.  I couldn’t sleep last night.

So, know what I did instead of sleeping? I bet you can guess!

Yes indeed, I started searching through my matches at Family Tree DNA for other people descended from female ancestors whose mtDNA I don’t have!

So, my first challenge to you is to do the same.
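The test I was applying in my head is mechanical enough to write down. This is a minimal sketch with function names of my own invention, where the sexes are given as "F" or "M" for the ancestor and for each person in the line of descent between the ancestor and the tester. A mother passes her mitochondrial DNA to all of her children, so the tester's own sex doesn't matter for mtDNA, while Y DNA requires an unbroken line of males that includes the tester.

```python
def carries_mtdna(ancestor_sex, intermediate_sexes):
    """True if the tester carries the ancestor's mitochondrial DNA.

    intermediate_sexes lists everyone between the ancestor and the tester,
    oldest first, not including either of them.
    """
    return ancestor_sex == "F" and all(s == "F" for s in intermediate_sexes)


def carries_y(ancestor_sex, intermediate_sexes, tester_sex):
    """True if the tester carries the ancestor's Y DNA."""
    return (
        ancestor_sex == "M"
        and tester_sex == "M"
        and all(s == "M" for s in intermediate_sexes)
    )


# Female ancestor, then daughter and granddaughter, then the tester.
print(carries_mtdna("F", ["F", "F"]))
# One male in the middle of the line breaks the mitochondrial path.
print(carries_mtdna("F", ["F", "M"]))
# Male ancestor, son, grandson, male tester.
print(carries_y("M", ["M", "M"], "M"))
```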

Utilizing Family Finder, enter the surname you’re searching for into the search box in the upper right hand corner of your matches page.

That search will produce individuals who have that surname included in their list of ancestral surnames or who carry that surname themselves.

Your tree feeds the ancestral surname list with all of the surnames in your tree.  I understand this will be changing in the future to reflect only your direct line ancestral surnames.

Some people include locations with their surnames – so you may recognize your line that way. Click on your match’s surname list (at far right) to show their entire list of surnames in a popup box. Some lists are very long.  I selected the example below because it’s short.

Your common surnames are bolded and float to the top.  The name you are searching for will be blue, so it’s easy to see, especially in long lists of surnames. 

About half of my matches at Family Tree DNA have trees.  Click on the pedigree icon and then search for your surname of interest in your match’s tree.

Hey, there’s our common ancestral couple – William George Estes and Ollie Bolton!!!

Idea 2 – Finding Hidden Y DNA

Now that I’ve shown you how to find hidden mitochondrial DNA, finding hidden Y DNA is easy.  Right?

You know what to do.

In this case, you’ll be looking for a male candidate who carries the surname of the line you are seeking, which is very easy to spot on the match list.

Now, word of warning.

As bizarre as this sounds, not all men who carry that surname and match autosomally are from the same genetic surname line.

As I was working with building a community tree for my matches last night, I was excited to see that one of my cousins (whose kit I manage) matches a man with the Herrell surname.

I quickly clicked on the match’s tree to see which Herrell male the match descends from, only to discover that he didn’t descend from my Herrell line.

Whoa – you’re saying – hold on, because maybe my line is misidentified.  And I’d agree with you – except in this case, I have the Y DNA signature of both lines – because at one time I thought they were one and the same. You can view the Herrell Y DNA project here.  My family line is Harrold Line 7.

Sure enough, through the Family Finder match, I checked my Harrell match’s profile and his haplogroup is NOT the same as my Herrell haplogroup (I-P37.)

I could have easily been led astray by the same surname. I really don’t need to know any more about his Y DNA at this point, because the completely different haplogroup is enough to rule out a common paternal line.

Don’t let yourself get so excited that you forget to be a skeptical genealogist 😊

My second challenge to you is to hunt for hidden Y DNA.

You can  increase your chances of finding your particular lineage by visiting the relevant Y DNA projects for your surname.

Click on Projects, then “Join a project,” then search for the DNA project that you’re interested in viewing and click on that link.

Within the project, look for oldest ancestors that are your ancestors, or potentially from a common location.  It’s someplace to start.

You can read more about how to construct a DNA pedigree chart in the article, “The DNA Pedigree Chart – Mining for Ancestors.”

Idea 3 – Pick A Puzzle Piece

Sometimes we get overwhelmed with the magnitude and size of the genealogy puzzle we’d like to solve. Then, we don’t solve anything.

This is exactly WHY I like the 52 Ancestor stories.  They make me focus on JUST ONE ancestor at a time.

So, for 2018, pick one genealogy puzzle you’d really like to solve. One person or one thing.  Not an entire line.

Write down your goal.

“I’d like to figure out whether John Doe was the son of William Doe or his son, Alexander Doe.”

Now admittedly, this is a tough one, because right off the bat, Y DNA isn’t going to help you unless you’re incredibly lucky and there is a mutation between Alexander Doe and his father, William.  If indeed that was the case, and you can prove it by the DNA of two of Alexander’s sons who carry the mutation, compared to the DNA of one of William’s other sons who does not, then you may be cooking with gas, presuming you can find a male Doe descended from John to test as well.

This is the type of thought process you’ll need to step through when considering all of the various options for how to prove, or disprove, a particular theory.

Make a list of the different kinds of evidence, both paper trail and genetic, that you could use to shed light on the problem. Your answer may not come from one piece of evidence alone, but a combination of several.

Evidence | Available/Source | Result
--- | --- | ---
William’s will | No, burned courthouse | Verified
Alexander’s will | No, burned courthouse | Verified
Deeds with William as conveyor | No, burned courthouse | Verified
Family Bible | Nope, no Bible |
Deeds with Alexander as conveyor, naming John | Possible, some deed books escaped fire | Check through county, FamilySearch does not list
Deeds with John as conveyor | Yes, check to see if they indicate the source of John’s land | John is listed in index, need to obtain original deeds from county
Y DNA of John’s line | Yes, has been tested | Matches DNA of William’s line as proven through William’s two brothers
Y DNA of Alexander | Not tested (to the best of my knowledge), find descendant to see if they will test | Search vendor DNA testing sites for male with this surname to see if they have/will Y DNA test
Closeness (in total cM and longest segment) of individuals autosomal matching through any of William’s descendants | Mine both Ancestry and FTDNA for surname and ancestor matches | This step may produce compelling or suggestive evidence, and it may not. Make a McGuire chart of results.
Does John match any relatives of the wife of Alexander Doe? | Search FTDNA and Ancestry for matches. Triangulate to determine if match is valid and through that line. | This is one of the best approaches to solve this type of problem when paper records aren’t available. Fingers crossed that Alexander and his wife are not related.

You can add pieces of evidence to your list as you think of them.

Making a list gives you something to work towards.

Your Turn!

Select one thing that you’d like to accomplish and either set about to do it, like mining for mitochondrial or Y DNA evidence, or put together a plan to gather evidence, both traditional and genetic.

In the comments, share what it is you’ll be searching for or working on.  You just never know if another subscriber may hold the answer you seek.

I can’t wait to hear what you’ll be doing this year!

Have a wonderful and productive New Year searching for those hidden ancestors!
