The Science Behind the Golden State Killer – Insitome Podcast

Please join Spencer Wells, Founder and CEO of Insitome, former Director of the Genographic Project and Explorer in Residence at National Geographic, Razib Khan, Director of Scientific Content at Insitome and yours truly as we discuss the science behind the Golden State Killer case.

I would like to thank Spencer and Razib for inviting me to join them today. It was fun discussing the case itself and the possible ramifications for this entire industry. I was going to add, “in the future,” but the future is here.

The Golden State Killer case is remarkable because of the combined techniques used to solve the crime which include DNA, genealogy and associated data bases in addition to traditional investigative work.

As Spencer Wells says in the podcast, this case is “Sherlock Holmesian.” What a movie this will make one day!

I wrote about this topic a few days ago in the article, The Golden State Killer and DNA.

How did all of these techniques work together to identify a suspect? How does the actual science work? Is it accurate? Are there issues? What about privacy concerns with more than 17 million people having already participated in direct to consumer testing?

Yes, more than 17 million at the end of 2017 – probably more than 20 million now and maybe 30 million by year end. Razib weighs in on how many is enough for forensic testing.

Learn about the underlying science and hear what Spencer and Razib, both geneticists, have to say.

Please join us at any of the following links:

For those who might not be aware, Spencer’s company, Insitome, doesn’t offer DNA testing for matching, so can’t be used for law enforcement purposes.

Insitome does offer Neanderthal, Regional Ancestry and Metabolism DNA testing.


The Golden State Killer and DNA

Joseph DeAngelo, 2018 mugshot, alleged Golden State Killer

Unless you’ve been living under a rock for the past few days, you already know that the Golden State Killer has, it appears, been apprehended by:

  1. Sequencing DNA from the original crime scene
  2. Uploading those results to a genealogy data base to utilize techniques currently used for unknown parent searches to suggest or identify the killer
  3. Sequencing discarded DNA from the suspect, which apparently matched the original DNA from the crime scene, to confirm that the right person had been identified

I say “it appears” because remember, until he’s convicted, Joseph DeAngelo is still a suspect.

I have received more messages, texts and e-mails about this one topic than any other, ever. My phone has been buzzing like an angry bee with too much caffeine for days.

Unfortunately, in many news articles, the topic suffers from dramatic over-simplification at best and significant errors at worst. This combined with lots of fear stirs a toxic brew.

In almost all cases, the author writing the article clearly didn’t understand the subject matter at hand. Many leaders in the genetic genealogy community have been asked for comment. Having had more than one situation in which I was misquoted or my quote was taken out of context, I am discussing the issue in this article, where my comments aren’t boiled down to a one sentence sound bite. I don’t want anyone having a knee-jerk reaction based on partial information. This topic deserves, and must receive, much more discussion in a calm, informed manner.

There is a great deal of concern, curiosity, misinformation and incorrect assumptions in the genetic genealogy community as well as the media, along with emotions running at high tide.

I think it’s important to do three things:

  1. Discuss what actually happened.
  2. Discuss how unknown parent and forensic searching differ from genealogy searching.
  3. Discuss associated concerns.

The Case

The Golden State Killer has been accused of at least 12 murders, more than 50 rapes and many burglaries, primarily from June 1975 through May 1986. DNA evidence was collected, but DNA testing at that time had not progressed to the point where the culprit could be identified from his DNA.

A lot has changed, both in terms of DNA technology and other resources available since that time.

Last week, on April 25th, Joseph DeAngelo, now in his 70s, was arrested after DNA matching implicated him as the Golden State Killer. The news is rife with stories, but this NPR article is a good summary, as are the references at the bottom of the wiki article linked above.

Initial Concerns

Initially, two questions were being asked.

  • Which genetic genealogy company “cooperated” with law enforcement?
  • Did law enforcement have a search warrant?

As it turns out, the answer is that no testing companies “cooperated” and that no search warrant was needed.

The next question was, “How safe is my DNA?”

Let’s talk about what happened, how it was done and how it affects each of us.

Disclosure

I was not involved with this or any similar case in any capacity, although I have been working the past few days to ferret out what actually happened, including discussing this privately and in public forums.

However, I am familiar with the techniques used as a result of my involvement with archaeological digs and ancient DNA, and I’d like to discuss what actually happened, as best we can unravel to date.

DNA Collection

At the time of the rapes and murders committed by the Golden State Killer, one police officer froze extra samples of the evidence, just in case, for the future. That future has arrived.

In the past few years, whole genome sequencing of ancient DNA and degraded samples has become possible. Probably the most notable are the Neanderthal and Denisovan genome reconstructions, beginning in 2010, but sequencing of forensic samples has become commonplace in the past few years.

From those ancient sequences, as long ago as September 2014, whole genome sequences were being reduced to just the DNA locations supported by GedMatch and the resulting compatible files uploaded there for comparison to other testers. This was possible because the raw data files are made available to testers by testing companies, so testers can modify the files in any way they see fit without the cooperation or involvement of any lab or company.
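For readers who like to see the mechanics, here is a minimal sketch of what “reducing” a file can look like. It assumes a tab-separated raw data layout of rsid, chromosome, position and genotype, and the list of supported locations is purely hypothetical. This is my own illustration, not the procedure anyone actually used for the ancient or forensic samples.

```python
def reduce_to_supported_snps(raw_lines, supported_rsids):
    """Keep only the rows whose rsid appears in supported_rsids.

    raw_lines: iterable of text lines laid out as
    rsid <tab> chromosome <tab> position <tab> genotype.
    Lines starting with '#' are treated as comments.
    """
    kept = []
    for line in raw_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip malformed rows rather than guess at them
        rsid, chromosome, position, genotype = fields
        if rsid in supported_rsids:
            kept.append((rsid, chromosome, int(position), genotype))
    return kept


# Hypothetical sample data and a hypothetical list of supported locations.
SAMPLE = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tAG
rs0000003\t2\t1500\tCC
"""
SUPPORTED = {"rs0000001", "rs0000003"}

reduced = reduce_to_supported_snps(SAMPLE.splitlines(), SUPPORTED)
for row in reduced:
    print("\t".join(str(value) for value in row))
```

The point is simply that the tester holds the file, so the filtering step needs nothing from any lab or company.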

More ancient samples were added to GedMatch in the following months, and the ancient DNA comparison feature continues to be quite popular. No one ever thought much about it, but there is absolutely no reason that same technique couldn’t be used for other samples, and indeed, now it has.

Just 13 days before the arrest of DeAngelo, another case saw a breakthrough through DNA sequencing. A murder victim known as Buckskin Girl, found in 1981, was identified as Marcia Lenore King.

According to the non-profit DNA Doe Project, whole genome sequencing was performed, the file was reduced to the format needed for GedMatch, and the file was uploaded.

Again, there was no public outcry – possibly because a victim had been identified and not a criminal suspect, and because the event was not as widely publicized. However, it’s also possible that if the Buckskin Girl’s murderer left DNA evidence on the body, that sequencing could have identified both the victim and the murderer.

The identification of Buckskin Girl, however, did spur non-public debate within the leadership of the genetic genealogy field. Little did we know that the next case would follow dramatically in just two weeks.

GedMatch Matching

GedMatch is an open data base created in 2011 by two individuals in order to facilitate open sharing of autosomal matching between people, even if they tested at different companies.

Of the DNA testing companies, at that time, only 23andMe and Family Tree DNA provided centiMorgan information, recently joined by MyHeritage. Ancestry does not provide this information to their clients, so if an Ancestry client wants to see how they match other individuals in terms of actual chromosome locations and centiMorgans, they must transfer to either Family Tree DNA, GedMatch or now, MyHeritage.

Because GedMatch, with few exceptions during periods of change, matches customers from every vendor against customers from every other vendor, at least partially, they have become the clearing house for many people, especially Ancestry customers who don’t have the chromosome comparison option natively at Ancestry.

I want to be VERY clear about what you can and cannot see and do at GedMatch.

You can see your matches by the name they have entered, which can be an alias, along with their e-mail and how you match them. You CANNOT see the information of anyone you don’t match, unless you utilize another person’s kit number to see who they match. This has always been how GedMatch functions.

GedMatch users do NOT have access to your actual DNA file – ever. They can see who they match, and if they have your kit number, they can see who you match as well. Here’s an example of my own match screen.

Note – typically when showing GedMatch screen shots, I would blur the kit numbers and names in keeping with good privacy practices. However, since the point is to show you what one can actually see, I haven’t, because the top two matches are my own kits from Ancestry and 23andMe, and the third kit is that of my deceased mother whose kit I now manage. I also want to demonstrate that truly, there is nothing frightening or threatening about the information your matches see about you.

Best Matches

From a genealogist’s perspective, your “best matches” are to known close relatives, because when you match that relative and another person, especially on the same DNA segment, it’s a good indication that you share a common ancestor further back in time.

Genealogists build “clusters” of those types of matches in order to prove a relationship to a common ancestor. This is the heart and soul of DNA matching for genealogy.

For example, someone who matches you and your first cousin, both, on the same rather large segment assuredly shares a common ancestor with you and your cousin someplace in the past. The genealogical goal, of course, is to identify that long-deceased ancestor.

For example, if you match a first cousin, you know that your most recent common ancestor is one of your two sets of grandparents. Most genealogy matches are further back in time than either first or second cousins, making the identification of the common ancestor more challenging. Discovering that common ancestor is the goal of the game – because these matches to people with the same ancestor in their tree (generally) confirm that your ancestor is accurately identified. Some matches solve long-time family mysteries and break down brick walls.

However, not all brick walls are in the past.

Adoptee and Parental Search Matching

A few years ago, genealogists attempting to find unknown parents for adoptees and people with unknown fathers noticed that there were matching patterns to be followed successfully.

With millions of people having tested today, it’s much easier than it was a few years ago to find that key match (or matches) that reveals or confirms the identity of either an ancestor or an unknown parent.

While both genealogists and unknown parent searches look for close matches, the techniques diverge at that point.

Genealogists use a first or second cousin match to move backwards in time, looking for common distant ancestors.

In unknown parent searches, the same genealogical technique is used, EXCEPT that the person doing the searching couldn’t care less about older ancestors, such as great-grandparents. They are looking for their immediate ancestors – their parents.

Therefore, when an adoptee finds that critical first cousin match, they aren’t interested in figuring out a common ancestor for genealogy, meaning going backward in time. They covet that first cousin match for the purpose of coming forward in time, meaning towards the present in order to identify parents.

If you match to someone as a first cousin, you share a common set of grandparents. You can’t tell, without additional information, which set of grandparents, but given that you do match as a first cousin, there are only two positions the match can have in your family – either the pink or blue person above. This means that either your father or mother was a sibling to your first cousin’s parent.

You either share your father’s parents with your first cousin, or your mother’s parents, but you don’t know which – at least not yet.

With that much information, it’s fairly easy to uncover the rest. After all, you only have two sets of grandparents and anyone who is your first cousin will point to one of those two sets of grandparents.
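Why is a first cousin match so useful? On average, the closer the relationship, the more autosomal DNA two people share, so a close match narrows the possibilities quickly. The small sketch below works out the expected average fraction using the standard halving-per-generation rule. The function name and parameters are my own, chosen for illustration, and real matches vary around these averages.

```python
from fractions import Fraction


def expected_shared_fraction(generations_a, generations_b, shared_ancestors=2):
    """Average fraction of autosomal DNA two relatives are expected to share.

    generations_a / generations_b: parent-child steps from each person up to
    the common ancestor(s). shared_ancestors is 2 for a full couple and 1 for
    half relationships. This is an expectation, not a guarantee.
    """
    meioses = generations_a + generations_b
    return shared_ancestors * Fraction(1, 2) ** meioses


first_cousins = expected_shared_fraction(2, 2)          # shared grandparents
half_first_cousins = expected_shared_fraction(2, 2, 1)  # one shared grandparent
second_cousins = expected_shared_fraction(3, 3)         # shared great-grandparents

print(first_cousins, half_first_cousins, second_cousins)
```

First cousins come out at one eighth on average, and the expected amount drops by a factor of four with each additional degree of cousinship, which is why more distant matches are so much harder to work with.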

You need to figure out who else matches you AND your first cousin, and then look at the genealogy of everyone who matches in that group until you discover the name of common family members/ancestors that you recognize, meaning an ancestor on either your maternal or paternal side to confirm that your first cousin matches you on that line.

Of course, for people who know their parents, figuring out first cousins is easy and takes about 2 seconds – but not so much for adoptees. Adoptees look to see how people who match them also match each other. For example, does the same couple or ancestor appear in the trees of multiple matches? In the example below, if the tester matches all three blue people as first cousins, the name of the blue cousins’ grandparents would be the same, suggesting that the tester’s grandparents were that same couple.
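The “does the same couple appear in multiple trees” question is really just counting. Here is a minimal sketch, assuming each match’s tree has already been boiled down to a set of ancestor couples. The names are made up, and real trees are far messier, with spelling variations and missing branches.

```python
from collections import Counter


def common_couples(trees, min_matches=2):
    """Return (couple, count) pairs for couples found in several matches' trees.

    trees: dict mapping a match's name to the set of ancestor couples
    listed in that match's tree.
    """
    counts = Counter()
    for couples in trees.values():
        counts.update(set(couples))  # count each couple once per tree
    return [(couple, n) for couple, n in counts.most_common() if n >= min_matches]


# Hypothetical trees for three first cousin matches.
trees = {
    "cousin_1": {"Couple A", "Couple B"},
    "cousin_2": {"Couple A", "Couple C"},
    "cousin_3": {"Couple A", "Couple D"},
}

shared = common_couples(trees)
print(shared)
```

A couple that shows up in every close match’s tree is the obvious place to start looking.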

Next, it’s necessary to figure out which people who descend from the common set of grandparents might be candidates to be the parent the tester is seeking. In the example below, we’ve expanded the side of the three blue first cousin matches, adding their parents’ siblings as parent candidates for our tester. Factors such as age and location at the time of conception are taken into consideration when focusing on parent candidates.

If the tester doesn’t know who their parents are, they would be VERY interested in determining ALL of the children of the grandparents of their first cousin. Because one of the children of their first cousin’s grandparents IS THEIR PARENT.

In our example above, let’s just look at one of the grandparent pairs of the blue first cousins. The first cousins know who their grandparents are. The tester does not. In this case either the father or mother of the tester is the child of the first cousin’s grandfather and grandmother. Meaning that the red mother is the female child of the grandparents, or the green father is the male child of the grandparents.

We know that the grey parents of the first cousin matches can be eliminated as the tester’s parent. If the first cousin’s parent was also the parent of the tester, then the first cousin wouldn’t be a first cousin, but would be a full or half sibling.

However, the matching first cousins’ parents have three siblings who have not DNA tested, nor have their children, shown in pink and green. One of those three siblings IS either the father or mother of the tester. Of course, if the grandparents didn’t have any female children, then the tester’s father is one of the green male children of the grandparents, and vice versa.
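The process of elimination described here can be sketched in a few lines. Everything in this example is hypothetical: the people, the birth years, and especially the age limits, which are only my assumption about what a plausible parent age range might look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    sex: str         # "F" or "M"
    birth_year: int


def parent_candidates(grandparents_children, cousins_parents, tester_birth_year,
                      sex=None, min_age=15, max_age=50):
    """List the grandparents' children who could be the tester's parent.

    cousins_parents: names of the parents of the first cousin matches. They
    are excluded, because their children would be siblings, not cousins.
    sex: optionally restrict to "F" or "M".
    min_age / max_age: assumed plausible parent ages at the tester's birth.
    """
    candidates = []
    for child in grandparents_children:
        if child.name in cousins_parents:
            continue
        if sex is not None and child.sex != sex:
            continue
        age_at_birth = tester_birth_year - child.birth_year
        if min_age <= age_at_birth <= max_age:
            candidates.append(child)
    return candidates


# Hypothetical family: five children of the shared grandparents.
children = [
    Person("Cousin parent 1", "M", 1940),
    Person("Cousin parent 2", "F", 1942),
    Person("Sibling A", "F", 1945),
    Person("Sibling B", "M", 1948),
    Person("Sibling C", "M", 1960),
]
cousins_parents = {"Cousin parent 1", "Cousin parent 2"}

candidates = parent_candidates(children, cousins_parents, tester_birth_year=1970)
print([person.name for person in candidates])
```

In this made-up family, the cousins’ parents drop out, as does the sibling who was a child when the tester was born, leaving two people to test or research further.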

In the example shown below, the tester’s mother IS the female child of the grandfather/grandmother pair and has been moved into place. This would be determined either by direct testing of the pink or green people, or their descendants, or by process of elimination through DNA tests of the other siblings or utilizing other pieces of information such as age and proximity.

Some adoptees are lucky enough to test and discover that a parent has tested and is waiting for them. Sometimes an unsuspected half sibling appears. Sometimes, there is no close match and the adoptee has to do more research work, including tracking people through social media and other means to find candidate family members to DNA test or to see if they know who might have been the much-sought-after parent.

Search Techniques

This type of research work has been taking place for years, individually, through groups like DNAadoption and DNADetectives who utilize volunteer search angels, as well as by several researchers who make a living doing this type of personal search. My focus is not on adoption search cases.

No one has seemed to consider this unethical, even though some of this work, especially when a parent isn’t immediately evident, involves utilizing the DNA of the tester’s matches and their matches’ relatives, connecting the family dots through social media, specifically Facebook pages, to discover the identity of someone who may not welcome that discovery. However, like GedMatch, Facebook, while not intended for this purpose, is public and is heavily utilized by adoption searchers.

Some adoption search cases end very well – with heartfelt beautiful reunions welcomed by all parties. Others not so much, potentially upending the life of the biological parent that was established after the adoption took place which leads to a rejection that devastates the adoptee. Much of the damage can be done by the search process itself, meaning that the biological parent is “outed” by the process of people working through relatives who have tested and match in various ways. Of course, they ask questions to identify the biological parent – meaning that by the time the parent is identified they have no say about their own privacy.

Once DNA is uploaded to a data base, the search techniques used for biological parent searches and those used to identify Buckskin Girl and DeAngelo are exactly the same.

These searches all utilize matches to others, and the matches’ trees, to move forward in time toward the present, searching for contemporary people rather than ancestors further back in time.

Back to the Golden State Killer

Ok, back to the Golden State Killer.

We have the killer’s DNA sequence from the original crime scene and the file reduced to the number of DNA locations utilized by GedMatch.

Someone, presumably one of the investigators working on the case, uploaded that file to GedMatch, which appears to be entirely permissible because the police have legal custody of that DNA sample.

Let’s say the investigator, just like a genealogist, found a first cousin match, or even more distant (read difficult) matches further back – and they did exactly what people searching for unknown parents do. The investigator eventually worked through all of the possibilities based on common matches – then looked at age, location, opportunity and factors that might exclude some candidates. In this case, because it’s a rape case with the criminal obviously a male, females would be excluded, for example.

Evidence from DNA matches to the biological sample of the Golden State Killer caused the police to focus on DeAngelo.

After DeAngelo was identified through matches as a suspect, the police obtained his discarded DNA. Discarded DNA could be anything from a coffee cup thrown away to a cigarette butt or something from the trash.

That discarded DNA was sequenced, and a few days before his arrest, uploaded to GedMatch as well. The discarded DNA apparently matched the earlier sample from the killer as “himself” and the other people that the killer matched in the same way – establishing the fact that the Golden State Killer and DeAngelo were one and the same person.

You can see that I match my own 23andMe and Ancestry kits as my closest matches in the GedMatch example I showed.

In essence, what the DNA of “the killer” obtained from the crime scene did was to generate leads through matching that allowed the police to identify DeAngelo and obtain a sample of his discarded DNA in order to verify that DeAngelo was the same person as the killer. Of course, he’s still a suspect today, not yet convicted.

Cooperation or Search Warrant

The police, in this case, didn’t need to ask for anyone’s cooperation. They already had the sample from the killer, so they did what hundreds of thousands of others have done and simply uploaded the file to GedMatch.

The investigators didn’t need a search warrant because they weren’t asking for anything from GedMatch not already freely given, meaning matches to anyone who has already uploaded their information.

The investigators only used that matching information to generate tips for further investigation. They repeated the entire process with the discarded DNA sample to verify the earlier results obtained with DNA from the crime scene.

It bears noting here that if DeAngelo’s DNA had NOT matched that of the killer and the other people in the same way the killer’s DNA had matched them, then the discarded DNA would have eliminated DeAngelo as a suspect.
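That confirm-or-eliminate logic can be pictured as comparing two match lists. This is a deliberately simplified sketch of the idea. The kit IDs, the shared centiMorgan values and the tolerance are all invented for illustration, and I am not suggesting this is how the investigators did the comparison.

```python
def same_match_profile(crime_scene_matches, suspect_matches, tolerance_cm=1.0):
    """Check whether two kits match the same people by about the same amount.

    Each argument maps a matching kit ID to total shared centiMorgans.
    tolerance_cm is an assumed allowance for small differences between
    two samples taken from the same person.
    """
    if crime_scene_matches.keys() != suspect_matches.keys():
        return False  # the two kits do not match the same set of people
    return all(
        abs(crime_scene_matches[kit] - suspect_matches[kit]) <= tolerance_cm
        for kit in crime_scene_matches
    )


# Hypothetical match lists.
crime_scene = {"KIT_A": 850.0, "KIT_B": 210.5}
discarded_sample = {"KIT_A": 850.3, "KIT_B": 210.1}
someone_else = {"KIT_A": 40.0}

print(same_match_profile(crime_scene, discarded_sample))
print(same_match_profile(crime_scene, someone_else))
```

If the two lists disagree, the person who left the discarded sample is not the person who left the crime scene sample, and that works in the suspect’s favor just as readily as against him.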

So, no genealogy testing company had to cooperate with anyone, nor was a search warrant necessary.

What’s the Rub?

We now have a monster about to be brought to justice. Two weeks earlier, Buckskin Girl, a murder victim, was identified and the family will finally have closure, 37 years later. Both of these are unquestionably wonderful outcomes.

So why are some people upset?

In some cases, people are simply confused about the process involved, and they will be relieved when they understand what actually happened – that their DNA was not “handed over” to anyone.

Some people have broader reaching concerns about privacy.

It appears that the word “police” combined with the word “criminal” caused a great deal of fear and trepidation, especially since a suspect was identified this time, not a victim and not someone’s biological parents.

Some people don’t want their DNA utilized to identify a family member, no matter what that person has done. And yes, that’s very nearly an exact quote from an e-mail I received.

Others are simply uncomfortable with their DNA being used in any kind of a potential criminal setting – even to identify a victim like Buckskin Girl.

One person says that it just makes her feel “creepy.” Oddly enough, that’s how I feel about Facebook now.

If you think it’s fine for adoptees to identify parents using these techniques, but you don’t think it’s alright for victims or criminals to be identified, I’d like to ask you to consider the following scenario.

An underage female is raped and becomes pregnant. She reports the rape to police at the time. She opts to have the child instead of having an abortion, and the child is placed for adoption. The rapist is never caught, and the young woman goes on to establish a new life and marry, not telling her husband or children born to the marriage about the rape, or the child placed for adoption. The expectation of the mother at that time was certainly that “no one would ever know,” whether those words were ever in an adoption contract or not. The fact that adoptions were (and still remain in many places) closed speaks to the expectations set for the mother.

Years pass, and today the adopted child, now an adult, tests. Both of the adoptee’s biological parents are identified through matches to relatives of the adoptee’s parents who have tested, such as first cousins in our earlier example. The adoptee’s parents themselves did not test.

Results were:

  • The life of the mother, a victim who did nothing wrong or illegal, and who chose to give the child life, is upended through the process of being identified.
  • The father who is a rapist, a criminal, is also identified.
  • The adoptee is subsequently very unhappy with both results for different reasons, but cannot press “undo.”

I’m NOT implying that these data bases shouldn’t be used for identifying parents. I AM saying that we need to consider that the techniques for identifying parents, victims and criminals are the same. The outcomes are not always positive in parent searches AND these areas are or can be incredibly intertwined. Unraveling or prohibiting one effectively prohibits others. How do we treat everyone fairly and how are those rules, whatever they might be, enforced, and by whom?

In other words, how do we “do no harm”? After all, this started out to be genealogy, a fun hobby, and has now progressed gradually through a slow crawl to something else. Here we sit today.

Consent

In the example rape case above, neither the biological mother nor the father had tested, but their family members had – just like in the Buckskin Girl and the Golden State Killer cases.

Today, relative to the Golden State Killer, people are upset because the database, GedMatch, into which they uploaded their DNA file for genealogy was used for other purposes – specifically to apprehend the Golden State Killer. They feel that isn’t the purpose for which they uploaded their DNA.

Any one of us could have been one of the matches to the Golden State Killer and some people obviously were. It bears repeating here that no one’s DNA or results were “handed over,” and the only people affected in any way were those who matched DeAngelo, and probably then only the closest matches. Many times people’s trees are utilized and their cousins never contact them, so it’s certainly possible that people who match DeAngelo still have no idea to this day.

The usage evolution for GedMatch from genealogy to other functions has been a slippery slope, although clearly no one realized it several years ago, when uploads of modified ancient sample kits began. Later, people began to use the GedMatch database (among others) to identify biological parents, then victims and now criminals.

Other people feel that searching for parents is genealogy, but identifying criminals is not – even though the search techniques are exactly the same. In our rape example, the mother who was a victim was identified and the criminal rapist father was identified as well by the same DNA test. The tester’s intent was only to reveal their biological parents – hoping for a loving, tear-filled reunion. That’s not what happened. The process of finding their parents also revealed the associated circumstances.

You can’t separate these usages into separate “boxes” anymore, because they overlap in unexpected ways. That rape case wasn’t hypothetical.

I have absolutely no sympathy for the rapist, in fact, quite the opposite – but I feel incredibly bad for the young mother who has now been twice victimized. First by the rapist and second by the process used to track her, through relatives who began asking lots of difficult questions.

Last fall, in a Facebook group I follow, I was utterly horrified to see someone post that in the adoption cases she works, she encourages the adoptee, when they feel they are “close” to identifying a parent, to send registered letters to all of the family members, asking them to test, hoping that those who aren’t the parent quickly test to absolve themselves and as a way to flush the parent out.

It’s Not Just Your DNA

In either case, the DNA of the RELATIVES of the person being sought, be it a parent, a victim or a criminal, is what is used to find or identify the desired person. People who have uploaded to GedMatch are now concerned that they might be that relative whose DNA is used in a way they did not originally anticipate. They are right, and not just about this particular criminal case – but about the many types of usage other than strictly genealogical research that looks backwards in time.

Perhaps the people who uploaded never thought about the fact that their DNA is/was being used for adoption or missing parent searches – or perhaps they are supportive of that activity. Maybe they thought that identifying victims, such as Buckskin Girl, was a great use of the data base by investigators. Maybe they never thought about the fact that searching for criminals who leave DNA specimens behind uses exactly the same research and matching techniques as adoptees’ parent searches. Perhaps no one stopped to think that the same search can identify both parents, a victim and a criminal at the same time.

Maybe they were naïve and never thought about it at all or didn’t read the GedMatch statement that said (and says):

In today’s world, there are real dangers of identity theft, credit fraud, etc. We try to strike a balance between these conflicting realities and the need to share information with other users. In the end, if you require absolute privacy and security, we must ask that you do not upload your data to GEDmatch. If you already have it here, please delete it.

I can’t tell you how many of the posts and e-mails I’ve seen about this topic include the word “assume,” and we all know about assume, right?

Maybe, like me, some people have thought about that potential situation and want criminals, regardless of whether they are relatives or not, off the streets. If they are relatives, so much the better, keeping my own family safer.

Some people may have been uploading their relatives’ DNA samples to GedMatch or any other site other than where the relative originally tested without the relative’s permission. If that’s the case, the person either needs to obtain permission, pronto, or delete the person’s DNA they uploaded without permission.

GedMatch’s Statement

GedMatch has posted the following statement.

Testing in the Future

Another concern voiced this week is that people, especially relatives that we want to test, will be much more reluctant to test in the future if they think the police can “take” or “access” their DNA. That’s probably true, so we need to be prepared to explain what actually happened, and how, to eliminate misconceptions.

However, it is true that DNA in these databases has been and is being used for things other than genealogy. This is also the purpose of informed consent – with an emphasis on informed. Bottom line – the cat’s out of the bag now. Perhaps these incidents together, meaning parent searches, the identification of Buckskin Girl and the arrest of the Golden State Killer, will bring home the warning that was previously noted on GedMatch.

If you’re not comfortable – don’t upload. This also means that people MUST STOP simply telling other people to upload to GedMatch as a cure-all for everything that ails genealogists. If you are making the recommendation, you also bear the responsibility for full disclosure or at least a caveat statement.

“GedMatch is great for genealogy matching to each other across vendor platforms <or words of your choosing>. It’s also used for adoptees searching for their parents, was used to identify Buckskin Girl and played an important role in the apprehension of the Golden State Killer.”

As a result, GedMatch now provides a way to remove your entire account, if you so wish. GedMatch needed to do that for GDPR anyway. As long as we are on the topic, GDPR, which goes into effect on May 25th, tightens privacy significantly for any vendor or company that includes records of any UK/EU resident. You can read about that in my articles here and here.

Every (major) testing company, along with GedMatch, provides the option of removing your DNA results if you are so inclined.

As for people being hesitant to test, certainly some already were and some will be. But there will also be others that only first heard about genetic genealogy this past week and this notoriety won’t deter them one bit. Some people will actively choose to participate, knowing that they can later change their mind if they so choose. I notice the GedMatch site has been busier than ever.

In summary, the police did not “take” or even ask for anyone’s DNA. They simply uploaded the DNA results of a criminal, taken from the crime scene, and looked at the matches generated in order to make an ID, at which time they obtained the DNA of the suspect which matched the DNA from the crime scene.

Just like genetic genealogy, DNA without supporting evidence won’t be much good, but now they have someone identified to work with, collecting other evidence. Where was he? Does the DNA at multiple scenes match his? I would think in terms of a prosecution that these matches and the arrest are only the beginning, not the end of the process.

Given that none of the major genealogy companies cooperate with law enforcement without a search warrant, it’s a WHOLE LOT easier to obtain your discarded DNA than to obtain a search warrant. Furthermore, there is no chain of custody with DNA from a genealogy data base, but there certainly is from a rape and from a discarded cup. If the DNA of the criminal from the scene, and the suspect’s DNA from a discarded item match as the same person, that’s pretty conclusive and damning evidence.

Of course, fear begets fear and the old questions of government access and other issues bubble up again.

Another question I’ve received is about whether the usage of GedMatch for the Golden State Killer case opens the door for DNA to be obtained by insurance companies. First, you’d have to test and upload something. There is nothing to “get” if you don’t – and the insurance company would need a search warrant (and probable cause of a crime) to retrieve your DNA from any testing company.

GINA legislation protects Americans today from discrimination when obtaining health insurance, but it doesn’t extend to life and other types of insurance. However, when I applied for life insurance some years ago, they took a blood sample and if I wanted life insurance, I had to authorize whatever it was they wanted to test in that sample. I’d wager that today, they would run a DNA test in addition to checking for other health indicators. No GedMatch or testing company is needed or desired – in fact – an insurance company requires chain of custody which is why they send someone to your house to draw your blood.

What To Do?

What you do with your DNA sample is entirely up to you. Everyone will make their own decision based on their own circumstances and preferences.

Some people have removed their DNA from the various databases and in essence, have stopped participating in genetic genealogy.

Some have made their kits at GedMatch either research or private. Research means that you can run the kit and see matches, but others can’t see you. That certainly defeats the spirit of collaborative genealogy.

Some people have evaluated the evidence at hand and have made the decision to continue as normal – just more aware of other uses that can, have and may occur.

This story and others similar will continue to arise and unravel, and many questions will likely be asked and hotly debated over the next many months and years, both within and outside of this community. I would not be surprised to see legislation of some type follow – which has been one of the biggest fears within the genetic genealogy community for years. Legislation by people unfamiliar with the topic at hand will likely be overreaching and extremely restrictive. Let’s hope I’m wrong.

Like many others, I’m concerned that the genetic genealogy field will become a victim of its own success. I hope that doesn’t happen, but at this point, the cow has left the barn and that door really can’t be effectively shut. All we can do is to be transparent, make informed choices, assure that we have the consent of anyone whose kit we manage and to advocate for sanity.

My Decision

I’ve made my personal decision and my thought process worked like this:

  1. I haven’t done anything that I need to worry about.
  2. If a family member does something they need to be arrested for, I hope my DNA helps.
  3. If I were the family of the victims, I would want them identified AND their murderer/rapist put away forever. (Disclosure, I have had a family member raped and a different family member murdered.)
  4. As a citizen, I want criminals such as rapists and murderers identified and removed from society through any legal means possible.
  5. DNA testing also exonerates people who were wrongfully convicted through advocacy groups like the Innocence Project.
  6. DNA eliminates potential criminal candidates as well as pointing the finger directly at others.
  7. Using the techniques utilized for unknown parent searches, an identification is seldom made as a result of ONE match only, unless it’s immediate family. Therefore, if you remove your own DNA from the data base(s) for matching, your cousin and their cousins are still there – so your criminal family member’s goose is still cooked. It might just take a little longer in the stew pot.

My DNA stays online and I continue to support all of the major DNA testing companies that provide matching and accept transfers, including GedMatch.

______________________________________________________________

DNAPainter – Mining Vendor Matches to Paint Your Chromosomes

This isn’t quite the same as when my mother used to talk about painting the town, but in genetic genealogy terms, it’s better.

This is the second of 4 articles that will describe how to use DNA Painter.

Today, I’d like to talk about how I utilize the various vendor testing tools combined with DNAPainter to “mine my DNA,” or better put, to mine my ancestors’ DNA which is now mine, pun intended.

To review instructions for how to set up and use the DNA Painter tool, please read DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts and then come back here to proceed.

I’m going to discuss each vendor’s tools and how I’ve used them, sometimes in combination.

57% Painted


Is this not a beautiful thing to behold? That’s my ancestors, in loving color, looking back at me, on MY chromosomes.

I’m completely thrilled that I have managed to paint 57% of my chromosomes. I’m a visual person, and while I’ve worked with spreadsheets now for years, I’ve officially abandoned them. Ok, mostly.

Yes, you heard me right – I’ve abandoned the spreadsheets in favor of DNA Painter, at least for segments where I can positively identify an ancestral couple. In other words, those segments that can be reliably mapped.

That 57% is made up of 445 segments in total, split between my maternal and paternal sides. That’s without counting my mother’s DNA. While I do utilize matching to my mother in order to be sure that a match is really a valid match, I didn’t paint her DNA. Obviously, I’m going to match her 100%, and DNA Painter already breaks chromosomes into my pink maternal and blue paternal sides.

Key Elements

  1. The single best thing you can do in order to paint your chromosomes is to have known family members and cousins test. You can then paint their DNA that matches yours, attributing it to their identified family line.
  2. The second best thing you can do is to work with your matches using their trees to identify your common ancestor.

Now, you’re ready to begin painting.

I’m going to step through the process I used at each vendor to identify paintable segments.

I did not paint segments that I could not attribute to an ancestral line. The one exception is my endogamous Acadian line: where I can tell that a segment is Acadian but can’t identify a specific ancestor or ancestors, I labeled it simply as Acadian. When I can identify the Acadian ancestor, I paint that segment using the ancestors’ names.

Family Tree DNA

At Family Tree DNA, I begin with my closest matches that are not immediate family – meaning not my parents, children or grandchildren. I’m looking for aunts, uncles, cousins, etc. I don’t paint siblings, but often half siblings are extremely useful because they can help you identify which paternal side other matches are related to.

In the first DNA Painter article, I explained how to utilize the Family Tree DNA chromosome browser to select an individual whose matching DNA can be displayed so that you can copy and paste that segment into the painting feature of DNA Painter.

On your results page, your “bucketed individuals” who have been assigned as maternal (pink icon above) or paternal (blue icon not shown) can be a huge clue when used in conjunction with the in-common-with (ICW) tool and the matrix.

You can also search by ancestral surname and then evaluate each match through common surnames, trees and other resources. If you’re not familiar with how to use the tools at Family Tree DNA, here’s a quick run-through.

Select the individual whose DNA you wish to paint, view in the chromosome browser, then copy and paste from the grid below to the DNAPainter tool.

I painted the matching DNA of all the people whose common ancestor with me I could positively identify before moving on to the next vendor.

Who Have I Painted?

As you begin to paint segments from multiple vendors, you may wonder if you’re finding duplicates. It’s easy to tell. At DNA Painter, click on “All segment data,” below the legend in the bottom right corner.

This displays the entire list of matches whose DNA you have painted, in spreadsheet format. You can sort by match name or simply do a browser search. (CTRL+F)

You can also download this data into a csv (Excel compatible) file at the top left of this page.

Avoiding Duplicates

As you view and paint your matches at the various vendors, you may discover that you have already found a match with that person at another vendor, either because they tested there or uploaded their autosomal file. When possible, avoid duplicate painting. It won’t help anything and will just clutter your chromosomes. You may not always be able to identify a match as a duplicate, especially if the tester utilizes a pseudonym at various locations. Don’t worry though, because you can always easily delete it later and a duplicate person/segment certainly won’t hurt anything.
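If you have downloaded your segment list, a short script can suggest candidates for a second look. This is only a sketch: the field names (`match`, `chr`, `start`, `end`) and the 2% boundary tolerance are assumptions chosen for illustration, not the actual DNA Painter export format, so adjust them to the file you really have.

```python
from itertools import combinations


def find_likely_duplicates(segments, tolerance=0.02):
    """Return pairs of match names whose painted segments nearly coincide.

    Each segment is a dict with hypothetical keys 'match', 'chr', 'start', 'end'.
    Two segments are flagged when they are on the same chromosome and both
    their start and end positions differ by no more than `tolerance`
    (a fraction of the longer segment's length), because vendors often
    report slightly different boundaries for the same match.
    """
    flagged = []
    for a, b in combinations(segments, 2):
        if a["chr"] != b["chr"]:
            continue
        length = max(a["end"] - a["start"], b["end"] - b["start"])
        slack = length * tolerance
        if abs(a["start"] - b["start"]) <= slack and abs(a["end"] - b["end"]) <= slack:
            flagged.append((a["match"], b["match"]))
    return flagged


# Made-up positions, for illustration only.
painted = [
    {"match": "Cheryl (FTDNA)", "chr": "1", "start": 1_000_000, "end": 21_000_000},
    {"match": "C. (GedMatch)", "chr": "1", "start": 1_100_000, "end": 20_900_000},
    {"match": "Rex (MyHeritage)", "chr": "1", "start": 50_000_000, "end": 60_000_000},
]
print(find_likely_duplicates(painted))
```

A flagged pair is only a prompt to look closer. Two different relatives, siblings for example, can legitimately share nearly the same segment with you.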

Ok, now to our next vendor! Let’s find more segments to paint.

MyHeritage

At MyHeritage, click on DNA matches.

At the right of the search box, fly over the little pink key (or funnel) looking thing and you’ll see the option for “Has Smart Matches.” That’s what you’re looking for.

Click on the key icon.

Smart Matches mean that your DNA matches and you have a common ancestor in your trees. Click on the purple button to review this DNA match.

For each match, scroll all the way down to the bottom where your matching chromosome segments will be colored.

At the right, above the chromosome browser, click on “advanced options” which will allow you to select “download shared DNA info.” You need to download to your system so that you can copy and paste the matching segment information to DNA Painter.

MyHeritage has a few more columns than necessary, and DNA Painter can’t utilize them. Delete the columns for Name, Match Name, RSID beginning and end, and also eliminate SNPs due to an overestimation issue. In many cases, the SNPs at MyHeritage are twice or more than the number of SNPs when comparing the same segment at other vendors.

Now that your segment is cleaned up, copy the entire group shown above, minus the yellow columns which you’ve deleted, and paste into the DNA Painter spreadsheet.
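If you would rather not delete the columns by hand each time, the same clean-up can be sketched in a few lines of Python. The header names below are assumptions based on the column descriptions in this article, so check them against the first row of your own download before relying on this.

```python
import csv
import io

# Hypothetical header names; compare with the header row of your own file.
UNWANTED = {"Name", "Match name", "Start RSID", "End RSID", "SNPs"}


def strip_columns(csv_text, unwanted=UNWANTED):
    """Return the CSV text with the unwanted columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name not in unwanted]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # keys not listed in `kept` are ignored
    return out.getvalue()


# Made-up row, for illustration only.
sample = (
    "Name,Match name,Chromosome,Start Location,End Location,"
    "Start RSID,End RSID,Centimorgans,SNPs\n"
    "Me,Alberta,5,1000,2000,rs1,rs2,12.5,9000\n"
)
print(strip_columns(sample))
```

What remains is the chromosome, start, end and centimorgan information, which is the part you paste into DNA Painter.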

MyHeritage has recently added a triangulation feature, shown at the far right, below, indicating that these two people individually triangulate with me and Alberta. The icon at far right of “5th cousin” indicates triangulation.

By clicking on the triangulation icon, you then see how that person triangulates with both your match and you – in this case, me, Alberta, and Chandler.

You may choose to paint triangulated segments, BUT, the size of the triangulated segment is often going to be smaller than the amount of DNA that you match individually to either one or both people.

In the example above, you can see that you match the pink person on a significantly longer segment than you match the tan person. The amount of DNA where you match both the pink and tan person is smaller yet, because the area where you match the tan person extends beyond where you match the pink person and vice versa. If you were going to paint ONLY the triangulated segments, you would paint only the portion that is both pink and tan, “boxed” above.
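The arithmetic behind that "boxed" region is simply the overlap of two ranges on the same chromosome. Here is a minimal sketch with made-up positions; the function name and the tuple layout are my own choices for illustration, not anything a vendor provides.

```python
def shared_region(seg_a, seg_b):
    """Return the (start, end) where two segments overlap, or None.

    Each segment is a (start, end) pair of positions on the same chromosome.
    """
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if start < end else None


# Illustrative positions only, not taken from the example in this article.
pink = (10_000_000, 60_000_000)
tan = (40_000_000, 75_000_000)
print(shared_region(pink, tan))
```

The overlap can never be longer than the shorter of the two matches, which is why triangulated segments tend to be smaller than the individual matches.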

I don’t recommend painting ONLY triangulated segments, because you’ll be depriving yourself of the ability for each person to match others on the portions of the segments on which they match you, but not the other person in question.

In this example, utilizing DNA Painter, you’ll see that people in fact match you AND the pink person on several segments. The segment shown in pink, at MyHeritage, above, is shown on chromosome 5 in DNA Painter as the long mustard colored segment. Look at how many people match you on that segment. This is why we don’t paint only the triangulated portions of the chromosome. That long mustard segment match will triangulate with many people on smaller portions of that mustard segment, as evidenced by the yellow, grey, blue, cinnamon, purple and red segment matches.

DNA Painter helps you triangulate, so there is no reason to restrict your painting to triangulated segments.

Triangulation is a great tool, but don’t mix triangulated segments with matching segments in the same profile, at least not until you get the hang of the tool and using the multiple vendors’ results.

23andMe

Unfortunately, 23andMe doesn’t have tools like tree matching (MyHeritage) or maternal/paternal phasing (Family Tree DNA), but they do allow testers to enter common surnames.

Looking at closer matches, meaning first, second or third cousins, if they list even a few surnames, you may well be able to identify the common genealogical line, especially in conjunction with ancestral locations and the other people you match in common.

Sometimes you can glean enough information to identify your common ancestor. In this case, even if I didn’t know Cheryl, the surname would have identified the ancestor. If that didn’t do it, the “in common” list below would!

Once you’ve identified the common ancestor and decide you’re ready to paint, click on the Tools tab at the top of your page and select DNA Relatives.

On the DNA Relatives tab, click on the relative whose DNA you wish to paint. I’m selecting my cousin, Cheryl.

Click on the blue DNA Comparison, in the upper right hand corner.

On the comparison screen, you will select yourself as one person and Cheryl as the other.

At the top you’ll see the two individuals and their overlapping segments painted onto chromosomes. Scroll down and you’ll see the segment detail, below.

Highlight the rows (they’ll turn blue, like above) and right click to copy the segment information.

The next step is to drop the results into a spreadsheet, just long enough to delete the first and last columns, shown in red below, then copy the remaining rows and paste into the DNA Painter tool.
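If you prefer to skip the spreadsheet, the same trim can be sketched in Python. This assumes the copied rows arrive as tab-separated text, which is typical when copying a table from a web page but is not guaranteed, and the cell values below are placeholders.

```python
def trim_first_and_last(pasted_text, delimiter="\t"):
    """Drop the first and last cell of every non-empty pasted row."""
    trimmed = []
    for line in pasted_text.splitlines():
        if not line.strip():
            continue  # skip blank lines picked up while copying
        cells = line.split(delimiter)
        trimmed.append(delimiter.join(cells[1:-1]))
    return "\n".join(trimmed)


# Placeholder row: 'first' and 'last' stand in for the two unwanted columns.
pasted = "first\t1\t1000\t2000\t12.5\tlast\n"
print(trim_first_and_last(pasted))
```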

Mining Ancestry Data at GedMatch

GedMatch is somewhat of a special case, because GedMatch doesn’t do DNA testing, but provides an open sharing platform by facilitating uploads of raw autosomal files from multiple other vendors. Therefore, anyone with results at GedMatch tested elsewhere. If you tested at all of the other vendors, it’s probable that you will find matches at GedMatch who also match you at other vendors.

Because 23andMe does not support the uploading of Gedcom files, if your match has uploaded a Gedcom file to GedMatch, or connected to Geni or WikiTree, then you may be able to identify your common ancestor at GedMatch that you were not able to identify at 23andMe.

Conversely, if you match at Ancestry, you won’t be able to paint from Ancestry, because Ancestry does not provide segment information. We will talk about Ancestry as a special case next, but for now, let’s focus on how to utilize GedMatch.

At GedMatch, you’ll work in steps after setting your account up and uploading your raw data file from either:

If you tested elsewhere, or after August of 2017 at 23andMe, you will have to upload to a special section called GedMatch Genesis. GedMatch Genesis provides a sandbox area for files other than the ones listed above that are generally incompatible with those files and with each other. Genesis files often have few SNP locations in common and not enough to match reliably.

I do not recommend DNA painting utilizing segments from GedMatch Genesis.

GedMatch is currently merging their regular GedMatch service with the Genesis service, so I’m not entirely clear how you will tell the difference between the kits known to match reliably, mentioned above, and others after the merge.

Currently, kits with T prefix (Family Tree DNA), A (Ancestry) and M (23andMe) show version levels in the type field when you match in regular GedMatch. MyHeritage kits are processed by the Family Tree DNA lab. G kits used a generic upload, so you can’t tell where they originated.

Kits uploaded in the Genesis sandbox seem to be assigned double alpha letter kit prefixes at random. Genesis includes a “Testing Company” field which does not include a version number. Today, just stay with the regular GedMatch one-to-many and one-to-one matching for DNA Painter.

First, you’ll want to perform a one-to-many match.

This page shows your closest 2000 results. In my case, that truncates my matches at 12.7 cM. This means if I want to see my results below 12.7 cM, I must subscribe to the Tier 1 Utilities in order to be able to display over 2000 matches.

We’ll discuss how to utilize Tier 1 matching in the Ancestry portion, next, but for now, we’ll just be working with the regular one-to-many matches report.

Of course, trusty cousin Cheryl has results here as well.

In order to compare Cheryl’s results to my own, I need to do two separate things:

  • Click on the A link under the Autosomal Details column (above) and/or
  • Click on the X link under the X DNA column

These two results, both of which are paintable, do not display together so must be selected separately.

By clicking on the A or X, GedMatch will display a one-to-one comparison. I leave this page (below) at the default values and simply click submit.

Your next screen will be a match grid.

Once again, select and copy the results, then paste into DNA Painter. If you also have an X match with this individual, return to the one-to-many match page and then click on the X link to repeat the same process for the X chromosome.

Ancestry Through GedMatch

As far as I’m concerned, the best thing about Ancestry matches is DNA shared ancestor hints (SAH) – meaning those green leaves visible near the green “view match” button which indicate that you share both DNA and a common ancestor(s) in your trees.

Followed immediately by the worst thing which is that Ancestry provides no segment data. However, pairing Ancestry with GedMatch can provide you with some segment information, although you do have to dig. That digging was certainly worthwhile for me, as I found several readily identifiable matches.

When I find a green leaf shared ancestor hint at Ancestry, I record as much information about that match as I can in a spreadsheet. I have three reasons.

  • Ancestry hints tend to come and go, rather inexplicably, and I want to have that information someplace besides at Ancestry.
  • I want to be able to view how many matches I have through specific ancestors, which I can do in a spreadsheet by sorting.
  • I want to be able to mine GedMatch for segment information for people at Ancestry who have uploaded to GedMatch.

Note the RJE V2 results, a 6th cousin who I match at 6.6 cM, as we’ll be using that at GedMatch.

I maintain several columns in my Ancestry Match spreadsheet, as shown above. I track people who might be good Y or mitochondrial DNA candidates, as well as GedMatch numbers or other useful information.
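Sorting in a spreadsheet works fine, but if you export that sheet, counting matches per ancestor takes only a few lines. This is a sketch; the field name `common_ancestor` and the "Couple A" and "Couple B" labels are placeholders, not my actual columns or ancestors.

```python
from collections import Counter


def matches_per_ancestor(rows):
    """Count spreadsheet rows per common ancestor, most frequent first."""
    counts = Counter(row["common_ancestor"] for row in rows)
    return counts.most_common()


# Placeholder rows, for illustration only.
rows = [
    {"match": "Match 1", "common_ancestor": "Couple A"},
    {"match": "Match 2", "common_ancestor": "Couple B"},
    {"match": "Match 3", "common_ancestor": "Couple A"},
]
for ancestor, count in matches_per_ancestor(rows):
    print(ancestor, count)
```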

I don’t utilize segments smaller than 7 cM for DNA Painter, BUT, Ancestry almost always under-reports the matching segment size due to their internal process which removes some segments that do match. Therefore, I search for all Ancestry matches in GedMatch and paint them if they are 7cM or over at GedMatch. You will match at Ancestry down to 6 cM. Since 7cM is the default GedMatch threshold, that works out well. I don’t find them if they are under 7cM at GedMatch, and I don’t care.

In my case, the free one-to-many GedMatch tool reaches its 2000 match threshold at 12.7 cM, so to obtain segments smaller than that, I need to utilize the Tier 1 subscription utilities, which are well worth every dollar.

The one-to-many match looks quite different for the Tier 1 tool.

You’ll need to play with this a bit to determine how high you need to set the limit to see all of your 7cM matches. In my case, I had to set it to 20,000.

I utilize two monitors, so I display my Ancestry spreadsheet on the first monitor and the GedMatch one-to-many match table on the second monitor.

Then, utilizing the browser’s search function, I search for any identifiable portion of the information for the Ancestry match at GedMatch.

In the first example, the user’s name is RJE V2. I search at GedMatch for “RJE” using “ctrl+F” which is the browser’s find function.

You can see that the search found a total of 3 different “RJE” entries. Looking at the first 2, you can see that one is labeled V4 and one is labeled V2. Typically, I would look at this and decide that the RJE V2 is the right match based on the user name at Ancestry.

However, look closer.

The RJE V2 at GedMatch has a much higher amount of shared DNA at 3587.1 cM total than the RJE V2 at Ancestry with a total of 6.6 cM. Clearly, this is not the same person, even though the user name is the same.

For all we know, a different person may have used the same user name, which is clearly an alias, noted by the “*”. Or the same person may have multiple kits at GedMatch.

However, in this case, the RJE V2 is not the same match.
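That sanity check can be written down as a rough rule. The sketch below compares the two shared cM totals and rejects pairs that differ wildly. The cutoff of 3 is an arbitrary number I picked for illustration, not a published standard, and it leaves room for the somewhat lower totals Ancestry tends to report.

```python
def plausibly_same_kit(ancestry_cm, gedmatch_cm, max_ratio=3.0):
    """Return True when two shared-cM totals are close enough to be the same match.

    max_ratio is an arbitrary illustrative cutoff, not an established threshold.
    """
    if ancestry_cm <= 0 or gedmatch_cm <= 0:
        return False
    ratio = max(ancestry_cm, gedmatch_cm) / min(ancestry_cm, gedmatch_cm)
    return ratio <= max_ratio


# The totals from the RJE V2 example: 6.6 cM at Ancestry, 3587.1 cM at GedMatch.
print(plausibly_same_kit(6.6, 3587.1))
```

A pair that passes is still only a candidate. You would want to confirm with the tree, the kit manager’s name or an email before painting.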

Now, let’s say that it is the same person and we’ve been able to reasonably identify the match. In order to compare one-to-one, click on the highlighted blue “largest segment” in the autosomal category, shown below.

If you want to compare the X one-to-one, click on the blue largest segment in that column.

From this point, the matching will look the same as the one-to-one GedMatch matching shown in the previous section – so copy and paste as normal.

While this certainly isn’t the most effective way of working with Ancestry matches, it’s really the only hope we have, unless your match has also uploaded to either Family Tree DNA or MyHeritage.

However, in my experience, I generally stand a better chance of identifying Ancestry matches at GedMatch because their user name or the user name of the person managing their account can be found much more readily. People sometimes tend to utilize the same abbreviations, names or nicknames in multiple locations.

Summary

While each vendor has unique strengths and weaknesses today, and GedMatch provides a platform used by some but not all, the best way to effectively paint your chromosomes is to utilize all of the tools available, and sometimes together. I strongly suggest that you test at or upload to each vendor, because you will find matches at each vendor that aren’t elsewhere.

How many segments can you paint on your chromosomes, and what will those segments tell you?

In the next article, I’ll be walking through my chromosome painting gallery to take a look at the hidden messages there! I hope you’ll come along so you can find some hidden messages of your own.

Enjoy!

______________________________________________________________

DNA Painter – Chromosome Sudoku for Genetic Genealogy Addicts

Not long ago, Jonny Perl introduced the free online tool, DNA Painter, designed to paint your chromosomes. I didn’t get around to trying this right away, but had I realized just how much fun I would have, I would have started sooner.

Fittingly, Jonny, pictured above, won the RootsTech Innovation award this year for DNA Painter – and I must say, it’s quite well-deserved.

Congratulations Jonny!

  • This is the first of four articles about DNA Painter. In this article, we’ll talk about how to use the tool, and how to get started.
  • The second article talks about mining your matches at the various vendors for paintable segments with instructions for how to do that accurately with each vendor.
  • In the third article, we’ll walk through an analysis of my painted segments, so you can too – and know how to spot revelations.
  • The fourth article explains how I solved a long-standing mystery that was driving me crazy. If you have a relatively close mystery person in your DNA match list that you can’t figure out quite where they fit, this article is written just for you!

I’ll tell you right now, I haven’t had this much fun in a long time!

Want to hear the best part? You don’t have to triangulate. DNA painting is “self-triangulating.” Yes, really!

Let’s get started!

Introducing DNA Painter

To begin to use DNA Painter, you’ll need to set up a free account at www.dnapainter.com.

Read the instructions and create your profile.

Jonny provides an overview.  Don’t get so excited that you skip this, or you won’t know how to paint correctly. You don’t need to be Picasso, but taking a few minutes up front will save you mistakes and frustration later.

Blaine Bettinger recorded a YouTube video discussing how to use DNA Painter to paint your chromosomes to identify and attribute particular segments to specific ancestors. It includes a mini-lesson on chromosome matching.

I strongly suggest you take time to watch Blaine’s video from the beginning. For some reason, this link drops into the video near the end, but just slide the red bar back to the beginning.

Get Started

Here’s my blank, naked chromosomes. Notice for every chromosome, you see a blue paternal “half” and a pink maternal “half.” That’s because everyone gets half of their autosomal DNA from their father, and the other half from their mother.

Looking at my own chromosome painting today, below, it’s incredibly exciting for me to see 57% of my DNA painted, attributed to 77 couples and one endogamous group, Acadians. This took me a month or so working off and on.

At the end of the day, this is often how I rewarded myself! The only problem is that it has been difficult to go to bed.

Comparatively, I’ve been working on my DNA match spreadsheet, attributing segments to ancestors now for 5 or 6 years, and I’ve never been able to see this information visually like this before. This view of my ancestrally painted chromosomes is so rewarding!

Who To Map

DNA Painter is not the kind of tool where you upload your results; it’s a tool where you selectively paint specific segments of matches – meaning segments on which you match particular people with known common ancestors.

How do you know who is a good candidate to map?

I began with painting my closest matches with whom I could identify the common ancestor.

Not only will painting your largest matches be rewarding as you harvest low-hanging-fruit, it will help you determine if you actually have identified the correct DNA for later matches being attributed to a specific genealogical line. In other words, mapping these larger known segments will help you identify false positives when you have no other yardstick.

Your First Painting

I’m opening a new profile in DNA Painter to demonstrate the steps in painting along with hints that I’ve learned along the way.

I’m going to utilize my cousin, Cheryl, whom I match closely at Family Tree DNA. If you don’t know how to use the Family Tree DNA autosomal tools, click here.

Cheryl is my first cousin once removed, so we share a significant amount of DNA.

I’ve selected Cheryl on my match list, checked her match box, and then clicked on the Chromosome Browser in order to view our segment matching information.

You can see on the chromosome browser that I share quite a bit of DNA with Cheryl.

At the top of the chromosome browser, click on “View this data in a table.”

Highlight and copy all of the segments for Cheryl. I only use 7cM segments or higher at DNA Painter, so you don’t have to copy the data in the rows below your last match at that level. DNA Painter takes care of stripping out all the extraneous stuff.

Paint a New Match

At DNA Painter, after you have your profile set up, click on “Paint a New Match.”

Simply paste the segment data into the box in the window that pops up. DNA Painter takes care of removing the header information as well as segments that are too small.

You can click on “overlay these segments” to “test” a fit, but I haven’t really found a good use for that, because I’m only painting segments I’m confident about and I know which side, maternal or paternal, the match is on based on the known relative.

Click on “save match now” in the bottom right corner.

In the Save Match popup, shown above, I utilize the fields as follows.

I enter the name of my DNA match, followed by their relationship to me, followed by the source of the match. In this case, “Cheryl <lastname>, 1C1R, FTDNA”

In the “Segment/Match Notes” I list how the match descends from the common ancestral couple, a GedMatch ID if known, and anything else pertinent including other potential ancestral lines in common. This means that I list every generation beginning with the common ancestral couple and ending with the tester.

Hiram Ferverda and Eva Miller, Roscoe, Cheryl, GedMatch Txxxxxx

You’ll wind up eventually rethinking some of your segment assignments to particular ancestors and you’ll want as much information here about this match as possible.

Moving to the next field, in the “Ancestors Name,” I utilize the couple’s names, because at this point, you can’t tell which of the two people actually contributed the DNA segment, or if part is from one ancestor of the couple and part is from the other. If the male ancestor is a Sr. or Jr., or is otherwise difficult to tell apart from your other ancestors, I suggest entering a birth year by his name. This is your selection list for later painting segments from the same ancestor, so you want to be sure you can tell the generations apart.

Next, you’ll select the maternal or paternal side of your family. Change the color if you don’t like the one pre-selected to assign to segments descending from that couple. Originally, I was going to have pinks or light colors for maternal, and blues or darker for paternal, but I quickly discovered that scheme didn’t work well, and I had more ancestors than I could ever have imagined whose DNA I am able to map and paint.

Therefore, pick contrasting colors. You can use each color on each half, meaning maternal and paternal, since the segments will be painted on different halves of the chromosome.

In the “Notes for This Group,” I add more information for the couple such as birth and death dates and location if I know or am likely to forget.

Click “save.”

Here you go!  Isn’t this fun!!!! Cheryl’s segments that match mine are painted onto my chromosomes!

At the right, your ancestor key appears with each ancestor to whom you’ve assigned a color key.

So far, I only have one!

Want to paint another group of segments?

Let’s paint Cheryl’s brother.

Following the same sequence, I paint Donald’s DNA, but this time, I select “Or link these segments to an ancestor I’ve added before.”

I select Hiram Ferverda, Eva Miller and save. The segments that I have in common with Cheryl and/or Don will now be displayed on each chromosome.

Looking at chromosome 1, you can see that I match Cheryl and Don on the same segment at the beginning of the chromosome, but received two different segments of DNA on a different portion of chromosome 1, further to the right.

As one last example, I added the DNA from two known cousins, Rex and Maxine, who descend a couple generations further back in time through more distant ancestors in the same line – one maternal and one paternal.

Click on the chromosome number to expand to see all of the painted segments.

You can see, looking at chromosome 3, that Cheryl and Don match me on a significant amount of the same large pink segment, plus a smaller pink segment at the end.

Rex (yellow) and Maxine (blue) both match me on different parts of the chromosome. It looks like there is a small amount of overlap between Rex and Maxine which is certainly feasible, because Jacob Lentz, the ancestor that Maxine descends from is ancestral to the couple that Rex descends from.

By utilizing known matches, and mapping, we can see segments that move us back in time, telling us from which ancestor that portion of the segment descends.

For example, if the blue segment was directly aligned with one of the pink segments, then we would know that the blue portion of the pink segment descended from Jacob Lentz and Fredericka Reuhl.
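
If you keep your painted segments in a spreadsheet, checking whether two matches overlap is simple arithmetic on the start and end positions. Here is a minimal Python sketch of that check. The Segment type, the overlap function and every coordinate below are hypothetical, made up for illustration; this is not DNA Painter's export format or code. The comparison only means something when both segments are painted on the same side, maternal or paternal.

```python
from typing import NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """One painted segment (hypothetical layout, positions in base pairs)."""
    match: str
    chromosome: int
    side: str      # "maternal" or "paternal"
    start: int
    end: int


def overlap(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return the (start, end) region shared by two segments, or None."""
    # Segments on different chromosomes or different parental copies
    # cannot share DNA, no matter what their coordinates are.
    if a.chromosome != b.chromosome or a.side != b.side:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return (start, end) if start < end else None


# Invented coordinates, for illustration only.
rex = Segment("Rex", 3, "maternal", 40_000_000, 62_000_000)
maxine = Segment("Maxine", 3, "maternal", 60_000_000, 75_000_000)
cheryl = Segment("Cheryl", 3, "maternal", 100_000_000, 150_000_000)

print(overlap(rex, maxine))   # (60000000, 62000000)
print(overlap(rex, cheryl))   # None
```

A shared region like the first result is the part you could push back a generation, provided the genealogy supports it.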

This is the most awesome, extremely addictive game of ancestor Sudoku ever.

Wanna play???

Here’s how to prepare for my next article where we’ll utilize the various vendor matches to begin painting.

Download and Upload Your Autosomal Files

You’ll want to have your DNA at the most vendor locations possible so you can find all your matches that can be attributed to known relatives and ancestors. You never know who is going to test at which vendor, and the only way to find out is to have your DNA there too.

For each vendor, I’ve provided a mini-tutorial on how to maximize your testing and transfers both monetarily and for maximum matching effect, or you can read an article here that explains more.

There’s also a cheat sheet for transfer strategies at the end of this article.

A technique called imputation is mentioned below, so you may want to read about imputation here. MyHeritage’s initial offering utilizing imputation was problem-plagued but has since improved significantly.

Ancestry

To Ancestry – There’s no way to transfer files TO Ancestry, so you’ll need to test there to be in their database. You will also need at least a minimum subscription ($49) to utilize all of the Ancestry DNA features. You can see a with and without subscription feature comparison chart here.

From Ancestry – There is also no chromosome browser at Ancestry. In order to use DNA Painter, chromosome segment information is required, so if you test at Ancestry and want to paint your segments, you’ll need to download your DNA file and upload it to any or all of:

  • Family Tree DNA – partially compatible with the current Ancestry test chip format – uses imputation to infer additional genetic regions
  • MyHeritage – partially compatible, but uses imputation to infer additional genetic regions
  • GedMatch

Family Tree DNA

To Family Tree DNA – You can upload the following vendor files TO Family Tree DNA.  Matching is free, but to use the advanced tools, including ethnicity and the chromosome browser, you’ll need to pay the $19 unlock fee. That’s still significantly less than retesting, especially for files that are 100% compatible.

  • Ancestry – V1 files generated from before May 2016 are entirely compatible, V2 files from after May 2016 are partially compatible, providing between 20-25% of your matches, meaning your closest matches
  • 23andMe – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 platform file beginning in August 2017 is not compatible
  • MyHeritage – fully compatible

From Family Tree DNA – You can upload your Family Finder results to:

MyHeritage

To MyHeritage – You can upload the following files to MyHeritage:

  • Family Tree DNA – fully compatible
  • Ancestry – partially compatible but uses imputation to infer additional genetic regions
  • 23andMe – partially compatible but uses imputation to infer additional genetic regions

From MyHeritage – If you test at MyHeritage, you can upload your files to:

23andMe

To 23andMe – You cannot transfer TO 23andMe, so you’ll need to test there if you want to be in their database.

From 23andMe – If you tested at 23andMe, you can upload your files to the following vendors:

  • Family Tree DNA – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 chip beginning in August 2017 is not compatible
  • MyHeritage – partially compatible but uses imputation to infer additional genetic regions
  • GedMatch – V3 file from Dec 2010-Nov 2013 and V4 file from November 2013-August 2017 are compatible, the V5 chip beginning in August 2017 is only compatible in the Genesis sandbox area. V5 matching is not reliable. Files from other vendors are recommended for GedMatch unless you are matching against another V5 result.

GedMatch

GedMatch is a third-party site that accepts all of these vendors’ autosomal files, with a caveat that the 23andMe V5 kit matches very poorly and requires special handling. I don’t recommend using that kit at GedMatch unless you are matching against other 23andMe V5 kits.
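
The vendor-by-vendor rules above amount to a lookup table, so here is a small Python sketch that encodes them. The dictionary layout, the status strings and the transfer_status function are my own hypothetical choices for illustration, not anything the vendors publish, and the entries only reflect what is described in this article at the time of writing.

```python
# Transfer rules as described in this article; they may change over time.
# Keys are (source vendor, chip version); a version of None means "any".
COMPATIBILITY = {
    "Family Tree DNA": {
        ("Ancestry", "V1"): "fully compatible",
        ("Ancestry", "V2"): "partially compatible",   # about 20-25% of matches
        ("23andMe", "V3"): "compatible",
        ("23andMe", "V4"): "compatible",
        ("23andMe", "V5"): "not compatible",
        ("MyHeritage", None): "fully compatible",
    },
    "MyHeritage": {
        ("Family Tree DNA", None): "fully compatible",
        ("Ancestry", None): "partially compatible",   # gaps filled by imputation
        ("23andMe", None): "partially compatible",    # gaps filled by imputation
    },
    "GedMatch": {
        ("Ancestry", None): "accepted",
        ("Family Tree DNA", None): "accepted",
        ("MyHeritage", None): "accepted",
        ("23andMe", "V3"): "compatible",
        ("23andMe", "V4"): "compatible",
        ("23andMe", "V5"): "Genesis sandbox only",
    },
}

NO_UPLOADS = {"Ancestry", "23andMe"}  # you must test there to be in the database


def transfer_status(source, version, destination):
    """Look up what the article says about moving a file between vendors."""
    if destination in NO_UPLOADS:
        return "no transfers accepted"
    table = COMPATIBILITY.get(destination, {})
    # Prefer an entry for the exact chip version, then a version-independent one.
    for key in ((source, version), (source, None)):
        if key in table:
            return table[key]
    return "not covered here"


print(transfer_status("Ancestry", "V2", "Family Tree DNA"))  # partially compatible
print(transfer_status("23andMe", "V5", "GedMatch"))          # Genesis sandbox only
print(transfer_status("MyHeritage", None, "Ancestry"))       # no transfers accepted
```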

I upload multiple kits to GedMatch and mark all but one for research only. This allows me to use my Ancestry kit to match with other Ancestry users for more accurate matches, my Family Tree DNA kit to other Family Tree DNA kits, and so forth. Not marking multiple kits for research means that you’ll appear more than once on other people’s match lists, and only your first 2000 matches are free. Marking all kits except one as research is a courtesy to others.

Recommended Testing Strategy for New Testers

  1. Test at Ancestry, then download your file and upload it to GedMatch.
  2. Test at Family Tree DNA and upload to MyHeritage and GedMatch.
  3. Test at 23andMe and upload to GedMatch Genesis.
  4. At GedMatch, mark all except one kit as “research,” then utilize your kits from the same vendor for one-to-one comparisons.

Recommended Transfer Strategy

Of course, where you have and haven’t already tested will impact your transfer strategy decision. I’ve prepared the following cheat sheet to be used in combination with the information discussed above.

*Unless you can transfer a 23andMe V3/V4 or an Ancestry V1 kit to Family Tree DNA, it’s better to test at Family Tree DNA. Ancestry V2 tests are only 20-25% compatible.

A transfer from Family Tree DNA to MyHeritage is best because those vendors are on the same platform and the tools at MyHeritage are free.

In my next article, we’ll discuss how to mine your matches at the various vendors to obtain accurate segments for chromosome painting – including a strategy for how to utilize Ancestry and Gedmatch together to identify at least some Ancestry segment matches.

So, for now, get ready by transferring your DNA files into whichever data bases they aren’t already in. The only data base where I couldn’t identify matches that I didn’t have elsewhere was at 23andMe. The rest were all there just waiting to be harvested!

_____________________________________________________________________


Who Tests the X Chromosome?

Recently, someone asked which of the major DNA testing companies test the X chromosome and which ones use the X in matching. How does this difference influence the quality of our matches?

| Vendor | X in Download File | Uses X in Matching | X Included in Total cM Count |
|---|---|---|---|
| 23andMe | Yes | Yes | Yes |
| Family Tree DNA | Yes | Yes (if have a match on another chromosome) | No |
| Ancestry | Yes | *No | No |
| MyHeritage | Yes | No | No |
| GedMatch | N/A | Separately | No |

*If Ancestry did utilize the X in matching, it wouldn’t benefit customers because Ancestry does not show segment information by chromosome.  In other words, no chromosome browser.

Family Tree DNA includes any size X match IF and only if the two people already match on a different chromosome.

GedMatch, of course, isn’t a vendor who does DNA testing, so they don’t provide download files.  They are solely on the receiving end.

X CentiMorgan Counts

Due to variations in the way vendors calculate matches and total cM counts, your mileage may vary a bit.

In other words, the 23andMe cM total, if an X match is involved, may be slightly more than a match between the same two people at Family Tree DNA, where the X match cM is not included in the cM total.

Conversely, you won’t show an X match with someone at Family Tree DNA if there isn’t also another segment on a different chromosome that matches.

In general, due to the thin spread of SNPs on the X chromosome, you will need, on average, a cM match that is twice as large as on other chromosomes to be considered of equal weight.

In other words, a 10 cM match on the X chromosome would only be genealogically equivalent to approximately a 5 cM match on any other chromosome.

X matches really can’t be evaluated by the same rules as other chromosomes due both to their SNP paucity and their inheritance path, which is why most vendors don’t include those segments in the total cM count.
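
To make the arithmetic concrete, here is a tiny Python sketch of the two ideas above: a total that either includes or excludes the X, and the "X counts about half" rule of thumb. The function names and the sample match are hypothetical, and the real vendor calculations are more involved than a simple sum, so treat this strictly as an illustration.

```python
def total_cm(segments, include_x):
    """Sum segment sizes; segments are (chromosome, cM) pairs."""
    return sum(cm for chrom, cm in segments if include_x or chrom != "X")


def x_equivalent_cm(x_cm):
    """Rule of thumb from the text: an X segment carries about half the weight."""
    return x_cm / 2


# A made-up match with two autosomal segments and one X segment.
segments = [("1", 22.0), ("7", 11.5), ("X", 10.0)]

print(total_cm(segments, include_x=True))    # 43.5, X counted in the total
print(total_cm(segments, include_x=False))   # 33.5, X left out of the total
print(x_equivalent_cm(10.0))                 # 5.0
```

The same two people can therefore show slightly different totals depending on whether the X segment is counted.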

X Matches

While including the X chromosome cM count is problematic, X matching can be a huge benefit because of the unique inheritance path of the X chromosome.

In the article, X Marks the Spot, we discussed the inheritance path of the X chromosome for both males and females. Females inherit an X chromosome from both father and mother, and in females the two X chromosomes recombine just like chromosomes 1-22.  However, men only inherit an X from their mother, because they inherit a Y from their father instead of the X.  Therefore, males inherit X DNA only through their mother, and the X that a female receives from her father is the X he inherited from his mother, passed on without recombination.
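
Those inheritance rules are easy to apply mechanically. The Python sketch below lists which ancestors can possibly have contributed to your X chromosome. It assumes standard Ahnentafel numbering (you are 1, a person's father is 2n and mother is 2n + 1), which is my choice for the illustration and not something the charting software requires; the function name x_ancestors is hypothetical too.

```python
def x_ancestors(tester_is_male, generations):
    """Return Ahnentafel numbers of ancestors on the tester's X path."""
    result = []
    current = [(1, tester_is_male)]
    for _generation in range(generations):
        parents = []
        for number, is_male in current:
            # A male received his X from his mother only;
            # a female received one X from each parent.
            if not is_male:
                parents.append((2 * number, True))    # father
            parents.append((2 * number + 1, False))   # mother
        result.extend(number for number, _is_male in parents)
        current = parents
    return sorted(result)


print(x_ancestors(tester_is_male=True, generations=3))
# [3, 6, 7, 13, 14, 15]
print(x_ancestors(tester_is_male=False, generations=2))
# [2, 3, 5, 6, 7]
```

Notice how quickly the list thins out compared with the full tree: a man has 8 great-grandparents but only 3 of them are on his X path.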

Charting Companion software works with your genealogy software of choice to produce a lovely fan chart where the contributors of my X chromosome are charted in color, above. You can read more about Charting Companion here.

The great news is that if you and a match share a significant portion of the X chromosome, meaning more than 15 cM which reduces the likelihood of an identical by chance match, the common ancestor (on that segment) has to be an ancestor in your direct X path.

I’m always excited to see with whom I share an X.  That piece of information alone helps me focus my ancestor detective efforts on a specific portion of my tree.

Some X segments can remain intact for generations and may be very old.  So don’t be surprised if the common ancestor of the X segment and the common ancestor of another matching segment are not the same person.

Sorting by X

I wasn’t able to find a way to sort by X chromosome matches at 23andMe, but you can sort by the X at both Family Tree DNA and GedMatch.

At GedMatch, X matching shows on the one-to-many match page.  You can sort by either Total X cM or Largest X cM by using the up and down arrows, at right, below, in the X DNA columns.

After you identify an X match, be sure to run the X one-to-one match option to verify.

My GedMatch matches cause me to wonder if 23andMe is using a different reporting threshold for the X chromosome, because one of my matches at GedMatch is a close family member with no X match at 23andMe, but a total of 32 X cM and with a longest segment of 14 X cM at GedMatch.

That same individual matches me with the largest X segment of 14 cM at Family Tree DNA as well.

Family Tree DNA X Match Phasing

At Family Tree DNA, on your Family Finder matches page, just click on the X-Match header (at right, below) to bring all of your X matches to the top of your list.

If you have linked any kits of relatives to your tree, you will see numbers of phased kits on the maternal and paternal tabs with the red and blue male and female icons. In the example above, I have 3313 matches total, with 744 being paternal, 586 being maternal.

Next, click on the maternal or paternal tab to see only the people with X matches who match you on your maternal or paternal lines. Matches are automatically sorted into maternal and paternal “buckets” for you. Remember to check the size of the X match before deciding about relevance.

Who is your largest X match that you don’t already know?  Maybe you can find your common ancestor today.

Have fun!!!

______________________________________________________________


2017 – The Year of DNA

Every year for the past 17 years has been the year of DNA for me, but for many millions, 2017 has been the year of DNA. DNA testing has become a phenomenon in its own right.

It was in 2013 that Spencer Wells predicted that 2014 would be the “year of infection.” Spencer was right and in 2014 DNA joined the ranks of household words. I saw DNA in ads that year, for the first time, not related to DNA testing or health as in, “It’s in our DNA.”

In 2014, it seemed like most people had heard of DNA, even if they weren’t all testing yet. John Q. Public was becoming comfortable with DNA.

In 2017 – DNA Is Mainstream  

If you’re a genealogist, you certainly know about DNA testing, and you’re behind the times if you haven’t tested.  DNA testing is now an expected tool for genealogists, and part of a comprehensive proof statement that meets the genealogical proof standard which includes “a reasonably exhaustive search.”  If you haven’t applied DNA, you haven’t done a reasonably exhaustive search.

A paper trail is no longer sufficient alone.

When I used to speak to genealogy groups about DNA testing, back in the dark ages, in the early 2000s, and I asked how many had tested, a few would raise their hands – on a good day.

In October, when I asked that same question in Ireland, more than half the room raised their hand – and I hope the other half went right out and purchased DNA test kits!

Consequently, because the rabid genealogical market is now pretty much saturated, the DNA testing companies needed to find a way to attract new customers, and they have.

2017 – The Year of Ethnicity

I’m not positive that the methodology some of the major companies utilized to attract new consumers is ideal, but nonetheless, advertising has attracted many new people to genetic genealogy through ethnicity testing.

If you’re a seasoned genetic genealogist, I know for sure that you’re groaning now, because the questions that are asked by disappointed testers AFTER the results come back and aren’t what people expected find their way to the forums that genetic genealogists peruse daily.

I wish those testers would have searched out those forums, or read my comparative article about ethnicity tests and which one is “best” before they tested.

More ethnicity results are available from vendors and third parties alike – just about every place you look it seems.  It appears that lots of folks think ethnicity testing is a shortcut to instant genealogy. Spit, mail, wait and voila – but there is no shortcut.  Since most people don’t realize that until after they test, ethnicity testing is becoming ever more popular with more vendors emerging.

In the spring, LivingDNA began delivering ethnicity results and a few months later, MyHeritage as well.  Ethnicity is hot and companies are seizing a revenue opportunity.

Now, the good news is that perhaps some of these new ethnicity testers can be converted into genealogists.  We just have to view ethnicity testing as tempting bait, or hopefully, a gateway drug…

2017 – The Year of Explosive Growth

DNA testing has become that snowball rolling downhill that morphed into an avalanche.  More people are seeing commercials, more people are testing, and people are talking to friends and co-workers at the water cooler who decide to test. I passed a table of diners in Germany in July and overheard, in English, a discussion about ethnicity-focused DNA testing.

If you haven’t heard of DTC, direct to consumer, DNA testing, you’re living under a rock or maybe in a third world country without either internet or TV.

Most of the genetic genealogy companies are fairly close-lipped about their data base size of DNA testers, but Ancestry isn’t.  They have gone from about 2 million near the end of 2016 to 5 million in August 2017 to at least 7 million now.  They haven’t said for sure, but extrapolating from what they have said, I feel safe with 7 million as a LOW estimate and possibly as many as 10 million following the holiday sales.

Advertising obviously pays off.

MyHeritage recently announced that their data base has reached 1 million, with only about 20% of those being transfers.

Based on the industry rumble, I suspect that the other DNA testing companies have had banner years as well.

The good news is that all of these new testers means that anyone who has tested at any of the major vendors is going to get lots of matches soon. Santa, it seems, has heard about DNA testing too and test kits fit into stockings!

That’s even better news for all of us who are in multiple data bases – and even more reason to test at all of the 4 major companies who provide autosomal DNA matching for their customers: Family Tree DNA, Ancestry, MyHeritage and 23andMe.

2017 – The Year of Vendor and Industry Churn

So much happened in 2017, it’s difficult to keep up.

  • MyHeritage entered the DNA testing arena and began matching in September of 2016. Frankly, they had a mess, but they have been working in 2017 to improve the situation.  Let’s just say they still have some work to do, but at least they acknowledge that and are making progress.
  • MyHeritage has a rather extensive user base in Europe. Because of their European draw, their records collections and the ability to transfer results into their data base, they have become the 4th vendor in a field that used to be 3.
  • In March 2017, Family Tree DNA announced that they were accepting transfers of both the Ancestry V2 test, in place since May of 2016, along with the 23andMe V4 test, available since November 2013, for free. MyHeritage has since been added to that list. The Family Tree DNA announcement provided testers with another avenue for matching and advanced tools.
  • Illumina obsoleted their OmniExpress chip, forcing vendors to Illumina’s new GSA chip which also forces vendors to use imputation. I swear, imputation is a swear word. Illumina gets the lump of coal award for 2017.
  • I wrote about imputation here, but in a nutshell, the vendors are now being forced to test only about 20% of the DNA locations available on the previous Illumina chip, and impute or infer using statistics the values in the rest of the DNA locations that they previously could test.
  • Early imputation implementers include LivingDNA (ethnicity only), MyHeritage (to equalize the locations of various vendor’s different chips), DNA.Land (whose matching is far from ideal) and 23andMe, who seems, for the most part, to have done a reasonable job. Of course, the only way to tell for sure at 23andMe is to test again on the V5 chip and compare to V3 and V4 chip matches. Given that I’ve already paid 3 times to test myself at 23andMe (V2, 3 and 4), I’m not keen on paying a 4th time for the V5 version.
  • 23andMe moved to the V5 Illumina GSA chip in August which is not compatible with any earlier chip versions.
  • Needless to say, the Illumina chip change has forced vendors away from focusing on new products in order to develop imputation code in order to remain backwards compatible with their own products from an earlier chip set.
  • GedMatch introduced their sandbox area, Genesis, where people can upload files that are not compatible with the traditional vendor files.  This includes the GSA chip results (23andMe V5,) exome tests and others.  The purpose of the sandbox is so that GedMatch can figure out how to work with these files that aren’t compatible with the typical autosomal test files.  The process has been interesting and enlightening, but people either don’t understand or forget that it’s a sandbox, an experiment, for all involved – including GedMatch.  Welcome to living on the genetic frontier!

  • I assembled a chart of who loves who – meaning which vendors accept transfers from which other vendors.

  • I suspect but don’t know that Ancestry is doing some form of imputation between their V1 and V2 chips. About a month before their new chip implementation in May of 2016, Ancestry made a change in their matching routine that resulted in a significant shift in people’s matches.

Because of Ancestry’s use of the Timber algorithm to downweight some segments and strip out others altogether, it’s difficult to understand where matching issues may arise.  Furthermore, there is no way to know that there are matching issues unless you and another individual have transferred results to either Family Tree DNA or GedMatch, neither of which remove any matching segments.

  • Other developments of note include the fact that Family Tree DNA moved to mitochondrial DNA build V17 and updated their Y DNA to hg38 of the human reference genome – both huge undertakings requiring the reprocessing of customer data. Think of both of those updates as housekeeping. No one wants to do it, but it’s necessary.
  • 23andMe FINALLY finished transferring their customer base to the “New Experience,” but many of the older features we liked are now gone. However, customers can now opt in to open matching, which is a definite improvement. 23andMe, having been the first company to enter the genetic genealogy autosomal matching marketspace has really become lackluster.  They could have owned this space but chose not to focus on genealogy tools.  In my opinion, they are now relegated to fourth place out of a field of 4.
  • Ancestry has updated their Genetic Communities feature a couple of times this year. Genetic Communities is interesting and more helpful than ethnicity estimates, but neither are nearly as helpful as a chromosome browser would be.

  • I’m sure that the repeated requests, begging and community level tantrum throwing in an attempt to convince Ancestry to produce a chromosome browser is beyond beating a dead horse now. That dead horse is now skeletal, and no sign of a chromosome browser. Sigh:(
  • The good news is that anyone who wants a chromosome browser can transfer their results to Family Tree DNA or GedMatch (both for free) and utilize a chromosome browser and other tools at either or both of those locations. Family Tree DNA charges a one time $19 fee to access their advanced tools and GedMatch offers a monthly $10 subscription. Both are absolutely worth every dime. The bad news is, of course, that you have to convince your match or matches to transfer as well.
  • If you can convince your matches to transfer to (or test at) Family Tree DNA, their tools include phased Family Matching which utilizes a combination of user trees, the DNA of the tester combined with the DNA of family matches to indicate to the user which side, maternal or paternal (or both), a particular match stems from.

  • Sites to keep your eye on include Jonny Perl’s tools which include DNAPainter, as well as Goran Rundfeldt’s DNA Genealogy Experiment.  You may recall that in October Goran brought us the fantastic Triangulator tool to use with Family Tree DNA results.  A few community members expressed concern about triangulation relative to privacy, so the tool has been (I hope only temporarily) disabled as the involved parties work through the details. We need Goran’s triangulation tool! Goran has developed other world class tools as well, as you can see from his website, and I hope we see more of both Goran and Jonny in 2018.
  • In 2017, a number of new “free” sites that encourage you to upload your DNA have sprung up. My advice – remember, there really is no such thing as a free lunch.  Ask yourself why, what’s in it for them.  Review ALL OF THE documents and fine print relative to safety, privacy and what is going to be done with your DNA.  Think about what recourse you might or might not have. Why would you trust them?

My rule of thumb, if the company is outside of the US, I’m immediately slightly hesitant because they don’t fall under US laws. If they are outside of Europe or Canada, I’m even more hesitant.  If the company is associated with a country that is unfriendly to the US, I unequivocally refuse.  For example, riddle me this – what happens if a Chinese (or fill-in-the-blank country) company violates an agreement regarding your DNA and privacy?  What, exactly, are you going to do about it from wherever you live?

2017 – The Year of Marketplace Apps

Third party genetics apps are emerging and are beginning to make an impact.

GedMatch, as always, has continued to quietly add to their offerings for genetic genealogists, as has DNAGedcom.com. While these two aren’t exactly “apps,” per se, they are certainly primary players in the third party space. I use both and will be publishing an article early in 2018 about a very useful tool at DNAGedcom.

Another application that I don’t use due to the complex setup (which I’ve now tried twice and abandoned) is Genome Mate Pro which coordinates your autosomal results from multiple vendors.  Some people love this program.  I’ll try, again, in 2018 and see if I can make it all the way through the setup process.

The real news here is the new marketplace apps based on Exome testing.

Helix and their partners offer a number of apps that may be of interest for consumers.  Helix began offering a “test once, buy often” marketplace model where the consumer pays a nominal price for exome sequencing ($80), significantly under market pricing ($500), but then the consumer purchases DNA apps through the Helix store. The apps access the original DNA test to produce results. The consumer does NOT receive their downloadable raw data, only data through the apps, which is a departure from the expected norm. Then again, the consumer pays a drastically reduced price and downloadable exome results are available elsewhere for full price.

The Helix concept is that lots of apps will be developed, meaning that you, the consumer, will be interested and purchase often – allowing Helix to recoup their sequencing investment over time.

Looking at the Helix apps that are currently available, I’ve purchased all of the Insitome products released to date (Neanderthal, Regional Ancestry and Metabolism), because I have faith in Spencer Wells and truthfully, I was curious and they are reasonably priced.

Aside from the Insitome apps, I think that the personalized clothes are cute, if extremely overpriced. But what the heck, they’re fun and raise awareness of DNA testing – a good thing! After all, who am I to talk, I’ve made DNA quilts and have DNA clothing too.

Having said that, I’m extremely skeptical about some of the other apps, like “Wine Explorer.”  Seriously???

But then again, if you named an app “I Have More Money Than Brains,” it probably wouldn’t sell well.

Other apps, like Ancestry’s WeRelate (available for smartphones), are entertaining, but are also unfortunately EXTREMELY misleading.  WeRelate conflates multiple trees, generally incorrectly, to suggest that you and another person on your Facebook friends list are related, or that you are related to famous people.  Judy Russell reviews that app here in the article, “No, actually, we’re not related.” No.  Just no!

I feel strongly that companies that utilize our genetic data for anything have a moral responsibility for accuracy, and the WeRelate app clearly does NOT make the grade, and Ancestry knows that.  I really don’t believe that entertaining customers with half-truths (or less) is more important than accuracy – but then again, here I go just being an old-fashioned fuddy-duddy expecting ethics.

And then, there’s the snake oil.  You knew it was going to happen because there is always someone who can be convinced to purchase just about anything. Think midnight infomercials. The problem is that many consumers really don’t know how to tell snake oil from the rest in the emerging DNA field.

You can now purchase DNA testing for almost anything.  Dating, diet, exercise, your taste in wine and of course, vitamins and supplements. If you can think of an opportunity, someone will dream up a test.

How many of these are legitimate or valid?  Your guess is as good as mine, but I’m exceedingly suspicious of a great many, especially those where I can find no legitimate scientific studies to back what appear to be rather outrageous claims.

My main concern is that the entire DTC testing industry will be tarred by the brush of a few unethical opportunists.

2017 – The Year of Focus on Privacy and Security

With increased consumer exposure comes increased notoriety. People are taking notice of DNA testing and it seems that everyone has an opinion, informed or not.  There’s an old saying in marketing; “Talk about me good, talk about me bad, just talk about me.”

With all of the ads have come a commensurate amount of teeth gnashing and “the-sky-is-falling” type reporting.  Unfortunately, many politicians don’t understand this industry and open mouth only to insert foot – except that most people don’t realize what they’ve done.  I doubt that the politicians even understand that they are tasting toe-jam, because they haven’t taken the time to research and understand the industry. Sound bites and science don’t mix well.

The bad news is that next, the click-bait-focused press picks up on the stories and the next time you see anyone at lunch, they’re asking you if what they heard is true.  Or, let’s hope that they ask you instead of just accepting what they heard as gospel. Hopefully if we’ve learned anything in this past year, it’s to verify, verify, verify.

I’ve been an advocate for a very long time of increased transparency from the testing companies as to what is actually done with our DNA, and under what circumstances.  In other words, I want to know where my DNA is and what it’s being used for.  Period.

Family Tree DNA answered that question succinctly and unquestionably in December.

Bennett Greenspan: “We could probably make a lot of money by selling the DNA data that we’ve been collecting over the years, but we feel that the only person that should have your DNA information is you.  We don’t believe that it should be sold, traded or bartered.”

You can’t get more definitive than that.

DTC testing for genetic genealogy must be a self-regulating field, because the last thing we need is for the government to get involved, attempting to regulate something they don’t understand.  I truly believe government interference by the name of regulation would spell the end of genetic genealogy as we know it today.  DNA testing for genetic genealogy without sharing results is entirely pointless.

I’ve written about this topic in the past, but an update is warranted and I’ll be doing that sometime after the first of the year.  Mostly, I just need to be able to stay awake while slogging through the required reading (at some vendor sites) of page after page AFTER PAGE of legalese😊

Consumers really shouldn’t have to do that, and if they do, a short, concise summary should be presented to them BEFORE they purchase so that they can make a truly informed decision.

Stay tuned on this one.

2017 – The Year of Education

The fantastic news is that with all of the new people testing, a huge, HUGE need for education exists.  Even if 75% of the people who test don’t do anything with their results after that first peek, that still leaves a few million who are new to this field, want to engage and need some level of education.

In that vein, seminars are available through several groups and institutes, in person and online.  Almost all of the leadership in this industry is involved in some educational capacity.

In addition to agendas focused on genetic genealogy and utilizing DNA personally, almost every genealogy conference now includes a significant number of sessions on DNA methods and tools. I remember the days when we were lucky to be allowed one session on the agenda, and then generally not without begging!

When considering both DNA testing and education, one needs to think about the goal.  All customer goals are not the same, and neither are the approaches necessary to answer their questions in a relevant way.

New testers to the field fall into three primary groups today, and their educational needs are really quite different, because their goals, tools and approaches needed to reach those goals are different too.

Adoptees and genealogists employ two vastly different approaches utilizing a common tool, DNA, but for almost opposite purposes.  Adoptees wish to utilize tests and trees to come forward in time to identify either currently living or recently living people while genealogists are interested in reaching backward in time to confirm or identify long dead ancestors. Those are really very different goals.

I’ve illustrated this in the graphic above.  The tester in question uses their blue first cousin match to identify their unknown parent through the blue match’s known lineage, moving forward in time to identify the tester’s parent.  In this case, the grandparent is known to the blue match, but not to the yellow tester. Identifying the grandparent through the blue match is the needed lynchpin clue to identify the unknown parent.

The yellow tester who already knows their maternal parent utilizes their peach second cousin match to verify or maybe identify their maternal great-grandmother who is already known to the peach match, moving backwards in time. Two different goals, same DNA test.

The three types of testers are:

  • Curious ethnicity testers who may not even realize that at least some of the vendors offer matching and other tools and services.
  • Genealogists who use close relatives to prove which sides of trees matches come from, and to triangulate matching segments to specific ancestors. In other words, working from the present back in time. The peach match and line above.
  • Adoptees and parent searches where testers hope to find a parent or siblings, but failing that, close relatives whose trees overlap with each other – pointing to a descendant as a candidate for a parent. These people work forward in time and aren’t interested in triangulation or proving ancestors and really don’t care about any of those types of tools, at least not until they identify their parent.  This is the blue match above.

What these groups of testers want and need differs, and so do their priorities, which shows in their recommendations and comments in online forums and in their input to vendors. You find Facebook groups dedicated to adoptees, for example, but you also find adoptees in more general genetic genealogy groups, where genealogists are sometimes surprised when people focused on parent searches downplay or dismiss tools such as Y DNA, mitochondrial DNA and chromosome browsers that form the bedrock foundation of what genealogists need and require.

Fortunately, there’s room for everyone in this emerging field.

The great news is that educational opportunities are abundant now. I’m listing a few of the educational opportunities for all three groups of testers, in addition to my blog of course.😊

Remember that this blog is fully searchable by keyword or phrase in the little search box in the upper right hand corner.  I see so many questions online that I’ve already answered!

Please feel free to share links of my blog postings with anyone who might benefit!

Note that these recommendations below overlap and people may well be interested in opportunities from each group – or all!!

Ethnicity

Adoptees or Parent Search

Genetic Genealogists

2018 – What’s Ahead? 

About midyear 2018, this blog will reach 1000 published articles. This is article number 939.  That’s amazing even to me!  When I created this blog in July of 2012, I wasn’t sure I’d have enough to write about.  That certainly has changed.

Beginning shortly, the tsunami of kits that were purchased during the holidays will begin producing matches, be it through DNA upgrades at Family Tree DNA, Big Y tests which were hot at year end, or new purchases through any of the vendors.  I can hardly wait, and I have my list of brick walls that need to fall.

Family Tree DNA will be providing additional STR markers extracted from the Big Y test. These won’t replace any of the 111 markers offered separately today, because the extraction through NGS testing is not as reliable as direct STR testing for those markers, but the Big Y will offer genealogists a few hundred more STRs to utilize. Yes, I said a few hundred. The exact number has not yet been finalized.

Family Tree DNA says they will also be introducing new “quality of life improvements” along with new privacy and consent settings.  Let’s hope this means new features and tools will be released too.

MyHeritage says that they are introducing new “Discoveries” pages and a chromosome browser in January.  They have also indicated that they are working on their matching issues.  The chromosome browser is particularly good news, but matching must work accurately or the chromosome browser will show erroneous information.  Let’s hope January brings all three features.

LivingDNA indicates that they will be introducing matching in 2018.

2018 – What Can You Do?

What can you do in 2018 to improve your odds of solving genealogy questions?

  • Test relatives
  • Transfer your results to as many data bases as possible (among the ones discussed above, after reading the terms and conditions, of course)
  • If you have transferred a version of your DNA that does not produce full results, such as the Ancestry V2 or 23andMe V4 test to Family Tree DNA, consider testing on the vendor’s own chip in order to obtain all matches, not just the closest matches available from an incompatible test transfer.
  • Test Y and mitochondrial DNA at Family Tree DNA.
  • Find ways to share the stories of your ancestors.  Stories are cousin bait.  My 52 Ancestors series is living proof.  People find the stories and often have additional facts, information or even photos. Some contacts qualify for DNA testing for Y or mtDNA lines. The GREAT NEWS is that Amy Johnson Crow is resuming the #52Ancestors project for 2018, providing hints and tips each week! Who knows what you might discover by sharing?! Here’s how to start a blog if you need some assistance.  It’s easy – really!
  • Focus on the brick walls that you want to crumble and then put together both a test and analysis plan. That plan could include such things as:

o   Find out if a male representing a Y line in your tree has tested, and if not, search through autosomal results to see if a male from that paternal surname line has tested and would be amenable to an upgrade.

o   Mitochondrial DNA test people who descend through all females from various female ancestors in order to determine their origins. Y and mtDNA tests are an important part of a complete genealogy story – meaning the reasonably exhaustive search!

o   Autosomal DNA test family members from various lines with the hope that matches will match you and them both.

o   Test family members in order to confirm a particular ancestor – preferably people who descend from another child of that ancestor.

o   Make sure your own DNA is in all 4 of the major vendors’ data bases, plus GedMatch. Look at it this way: everyone who is at GedMatch or at a third party (non-testing) site had to have tested at one of the major 4 vendors – so if you are in all of the vendors’ data bases, plus GedMatch, you’re covered.

Have a wonderful New Year and let’s make 2018 the year of newly discovered ancestors and solved mysteries!

______________________________________________________________

Imputation Matching Comparison

In a future article, I’ll be writing about the process of uploading files to DNA.Land and the user experience, but in this article, I want to discuss only one topic, and that’s the results of imputation as it affects matching for genetic genealogy. DNA.Land is one of three companies known positively to be using imputation (DNA.Land, MyHeritage and LivingDNA), and one of two that allow transfers and do matching for genealogy.

This is the second in a series of three articles about imputation.

Imputation, discussed in the article, Concepts – Imputation, is the process whereby your DNA that is tested is then “expanded” by inferring results you don’t have, meaning locations that haven’t been tested, by using information from results you do have. Vendors have no choice in this matter, as Illumina, the maker of the DNA chip widely utilized in the genetic genealogy marketspace, has obsoleted the prior chip and moved to a new chip with only about 20% overlap in the locations previously tested. Imputation is the methodology utilized to attempt to bridge the gap between the two chips for genetic genealogy matching and ethnicity predictions.

Imputation is built upon two premises:

1 – that DNA locations are inherited together

2 – that people from common populations share a significant amount of the same DNA

An example of imputation that DNA.Land provides is the following sentence.

I saw a blue ca_ on your head.

There are several letters that are more likely than others to be found in the blank and some words would be more likely to be found in this sentence than others.

A less intuitive sentence might be:

I saw a blue ca_ yesterday.

DNA.Land doesn’t perform DNA testing, but instead takes a file that you upload from a testing vendor that has around 700,000 locations and imputes another 38.3 million variants, or locations, based on what other people carry in neighboring locations. These numbers are found in the SNPedia instructions for uploading DNA.Land information to their system for usage with Promethease.
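The fill-in-the-blank idea can be sketched in a few lines of Python. This is only a toy illustration of the concept, not DNA.Land’s actual method: the function name `impute`, the `_` gap marker and the little reference “population” of words are all assumptions made up for the example. The sketch keeps the reference sequences that agree with every tested position, then fills each gap with the most common value found there.

```python
from collections import Counter

def impute(sample, reference_panel):
    """Fill '_' gaps in `sample` using reference sequences that agree
    with it at every position the sample actually has (toy example)."""
    # Keep only references consistent with the tested positions.
    compatible = [
        ref for ref in reference_panel
        if len(ref) == len(sample)
        and all(s == "_" or s == r for s, r in zip(sample, ref))
    ]
    if not compatible:
        return sample  # nothing to infer from, so leave the gaps alone

    imputed = []
    for i, s in enumerate(sample):
        if s != "_":
            imputed.append(s)  # tested value, never overwritten
        else:
            # Most common value at this position among compatible references.
            value, _count = Counter(ref[i] for ref in compatible).most_common(1)[0]
            imputed.append(value)
    return "".join(imputed)

# Hypothetical reference "population" of words (an assumption for illustration).
panel = ["cap", "cap", "cat", "car", "hat"]
print(impute("ca_", panel))
```

Notice that the guess depends entirely on who is in the reference panel. A panel full of “cat” would produce a different answer for the same tested letters, which is why an imputed value is a prediction and not a test result.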

I originally wrote about Promethease here, and I’ll be publishing an updated article shortly.

In this article, I want to see how imputation affects matching between people for genetic genealogy purposes.

Genetic Genealogy Matching

In order to be able to do an apples to apples comparison, I uploaded my Family Tree DNA autosomal file to DNA.Land.

DNA.Land then processed my file, imputed additional values, then showed me my matches to other people who have also uploaded and had additional locations imputed.

DNA.Land has just over 60,000 uploads in their data base today. Of those, I match 11 at a high confidence level and one at a speculative level.

My best match, meaning my closest match, Karen, just happened to have used her GedMatch kit number for her middle name. Smart lady!

Karen’s GedMatch number provided me with the opportunity to compare our actual match information at DNA.Land, then also at GedMatch, then compare the two different match results in order to see how much of our matching was “real” from portions of our tested kits that actually match, and what portion of our DNA matches as a result of the DNA.Land imputation.

At DNA.Land, your match information is presented with the following information:

  • Relationship degree – meaning estimated relationship
  • # shared segments – although many of these are extremely small
  • Total shared cM
  • Total recent shared length in cM
  • Longest recent shared segment in cM
  • Relationship likelihood graph
  • Shared segments plotted on chromosome display
  • Shared segments in a table

Please note that you can click on any graphic to enlarge.

DNA.Land provides what they believe to be an accurate estimate of recent and anciently shared DNA segments.

The match table is a dropdown underneath the chromosome graphic at far right:

For this experiment, I copied the information from the match table and dropped it into a spreadsheet.

DNALand Match Locations

My match information is shown at DNA.Land with Karen as follows:

Matching segments are identified by DNA.Land as either recent or ancient, which I find to be over-simplified at best and misleading or inaccurate at worst. I guess it depends on how you perceive recent and ancient. I think they are trying to convey the concept that larger segments tend to be more recent, and smaller segments tend to be older, but ancient in the genetics field often refers to DNA extracted from exhumed burials from thousands of years ago.  Furthermore, smaller segments can be descended from the same ancestor as larger segments.

GedMatch Match

Since Karen so kindly provided her GedMatch kit number, I signed in to GedMatch and did a one-to-one match with this same kit.

Since all of the segments are 3 cM and over at DNA.Land, I utilized a GedMatch threshold of 3 cM and dropped the SNP count to 100, since a SNP count of 300 gave me few matches. For this comparison, I wanted to see all my matches to Karen, no matter how few SNPs are involved, in an attempt to obtain results similar to DNA.Land. I normally would not drop either of these thresholds this low. My typical minimum is 5cM and 500 SNPs, and even if I drop to 3cM, I still maintain the 500 SNP threshold.
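The effect of those two thresholds can be shown with a minimal sketch. The `Segment` fields, the function name `filter_segments` and the three sample rows are hypothetical, and the SNP counts in particular are invented for illustration. Only the threshold pairs (5 cM / 500 SNPs, 3 cM / 300 SNPs, 3 cM / 100 SNPs) come from the text above.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chromosome: int
    cm: float   # segment length in centimorgans
    snps: int   # number of SNPs in the segment

def filter_segments(segments, min_cm=5.0, min_snps=500):
    """Keep segments that meet BOTH the cM and the SNP minimum."""
    return [s for s in segments if s.cm >= min_cm and s.snps >= min_snps]

# Hypothetical segments; the SNP counts are invented for illustration.
segments = [
    Segment(chromosome=8, cm=31.5, snps=6200),
    Segment(chromosome=3, cm=3.4, snps=310),
    Segment(chromosome=12, cm=3.1, snps=140),
]

print(len(filter_segments(segments)))                          # typical 5 cM / 500 SNPs
print(len(filter_segments(segments, min_cm=3, min_snps=300)))  # 3 cM / 300 SNPs
print(len(filter_segments(segments, min_cm=3, min_snps=100)))  # 3 cM / 100 SNPs
```

Each step down in the thresholds lets more small segments through, which is exactly why I don’t normally drop them this low.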

Let’s see how the data from GedMatch and DNA.Land compares.

In my spreadsheet, below, I pasted the segment match information from DNA.Land in the first 5 columns with a red header. Note that DNA.Land does not provide the number of shared SNPs.

At right, I pasted the match information from GedMatch, with a green header. We know that GedMatch has a history of accurately comparing segments, and we can do a cross platform comparison. I originally uploaded my FTDNA file to DNA.Land and Karen uploaded an Ancestry file. Those are the two files I compared at GedMatch, because the same actual matching locations are being compared at both vendors, DNA.Land (in addition to imputed regions) and GedMatch.

I then copied the matching segments from GedMatch (3cM, 100 SNPs threshold) and placed them in the middle columns in the same row where they matched corresponding DNA.Land segments. If any portion of the two vendors’ segments overlapped, I copied them as a match, although two are small and partial and one is almost negligible. As you can see, there are only 10 segments with any overlap at all in the center section. Please note that I am NOT suggesting these are valid or real matches.  At this point, it’s only a math/match exercise, not an analysis.
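I did this by hand in a spreadsheet, but the “any portion overlaps” rule is easy to express in code. The following is a sketch only. The function names and the segment coordinates are hypothetical and are not my actual match data. Segments are written as (chromosome, start, end).

```python
def overlap_bp(a, b):
    """Shared base pairs between two segments given as (chromosome, start, end)."""
    if a[0] != b[0]:
        return 0  # different chromosomes never overlap
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))

def pair_segments(imputed_segments, tested_segments):
    """For each imputed-vendor segment, list the tested-vendor segments it touches."""
    return {
        seg: [t for t in tested_segments if overlap_bp(seg, t) > 0]
        for seg in imputed_segments
    }

# Hypothetical coordinates: one long segment reported as two pieces by the
# imputing vendor, plus one segment the other vendor does not report at all.
dnaland = [(8, 100, 250), (8, 260, 400), (3, 50, 90)]
gedmatch = [(8, 90, 410), (5, 10, 60)]

pairs = pair_segments(dnaland, gedmatch)
unsupported = [seg for seg, hits in pairs.items() if not hits]
print(unsupported)
```

Any overlap at all counts here, no matter how small, so this is the most generous possible way to score the imputed segments.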

The match comparison column (yellow header) is where I commented on the match itself. In some cases, the lack of the number of SNPs at DNA.Land was detrimental to understanding which vendor was a higher match. Therefore, when possible, I marked the higher vendor in the Match Comparison column with the color of their corresponding header.

Analysis

Frankly, I was shocked at the lack of matching between GedMatch and DNA.Land. Trying to understand the discrepancy, I decided to look at the matches between Karen, who has been very helpful, and me at other vendors.

I then looked at our matches at Ancestry, 23andMe, MyHeritage and at Family Tree DNA.

The best comparison would be at Family Tree DNA where Karen loaded her Ancestry file.  Therefore, I’m comparing apples to apples, meaning equivalent to the comparison at GedMatch and DNA.Land (before imputation).

It’s impossible to tell much without a chromosome browser at Ancestry, especially after Timber processing which reduces matching DNA.

DNA.Land categorized my match to Karen as “high certainty.” My match with Karen appears to be a valid match based on the longest segment(s) of approximately 30cM on chromosome 8.

  • Of the 4 segments that DNA.Land identifies as “recent” matches, 2 are not reflected at all in the GedMatch or Family Tree DNA matching, suggesting that these regions were imputed entirely, and incorrectly.
  • Of the 4 segments that DNA.Land identifies as “recent” matches, the 2 on chromosome 8 are actually one segment that imputation apparently divided. According to DNA.Land, imputation can increase the number of matching segments. I don’t think it should break existing segments, meaning segments actually tested, into multiple pieces. In any event, the two vendors do agree on this match, even though DNA.Land breaks the matching segment into two pieces where GedMatch and Family Tree DNA do not. I’m presuming (I hate that word) that this is the one segment that Ancestry calls as a match as well, because it’s the longest, but Ancestry’s Timber algorithm downgrades the match portion of that segment to 18cM, removing either 11cM from the 29cM reported by DNA.Land or 13cM from the 31cM reported by both GedMatch and Family Tree DNA. Both GedMatch and Family Tree DNA agree and appear to be accurate at 31cM.
  • Of the total 39 matching segments of any size, utilizing the 3cM threshold and 100 SNPs, which I set artificially very low, GedMatch only found 10 matching segments with any portion of the segment in common, meaning that at least 29 were entirely erroneous matches.
  • Resetting the GedMatch match threshold to 3 cM and 300 SNPs, a more reasonable SNP threshold for 3cM, GedMatch only reports 3 matching segments, one of which is chromosome 8 (undivided). Because DNA.Land splits the chromosome 8 match in two, those 3 GedMatch segments correspond to 4 DNA.Land segments, which means that at this threshold, 35 of the 39 matching DNA.Land segments are entirely erroneous. Setting the threshold to a more reasonable 5cM or 7cM and 500 SNPs would result in only the one match on chromosome 8.

  • If 29 of 39 segments (at 3cM 100 SNPs) are erroneously reported, that equates to 74.36% erroneous matches due to imputation alone, without considering identical by chance (IBC) matches.
  • If 35 of 39 segments (at 3cM 300 SNPs) are erroneously reported, that equates to 89.74% erroneous matches, again without considering those that might be IBC.
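The two percentages above are simple division, shown here as a quick check. The function name `percent_erroneous` is just a label chosen for this sketch.

```python
def percent_erroneous(unconfirmed, total):
    """Share of reported segments with no counterpart at the other vendor."""
    return round(100 * unconfirmed / total, 2)

total_dnaland_segments = 39

# 3 cM / 100 SNPs: 10 segments overlap at all, so 39 - 10 = 29 do not.
print(percent_erroneous(total_dnaland_segments - 10, total_dnaland_segments))  # 74.36

# 3 cM / 300 SNPs: 35 segments have no counterpart.
print(percent_erroneous(35, total_dnaland_segments))  # 89.74
```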

Predicted vs Actual

One additional piece of information that I gathered during this process is the predicted relationship.

Vendor | Total cM | Total Segments | Longest Segment (cM) | Predicted Relationship
DNA.Land | 162 to 3 cM | 39 to 3 cM | 17.3 & 12, split | 3C
GedMatch | 123 to 3 cM | 27 to 3 cM | 31.5 | 5.1 gen distant
Family Tree DNA | 40 to 1 cM | 12 to 1 cM | 32 | 3-5C
MyHeritage | No match | No match | No match | No match
Ancestry | 18.1 | 1 | 18.1 | 5-8C
23andMe | 26 | 1 | 26 | 3-6C

Karen utilized her Ancestry file and I used my Family Tree DNA file for all of the above matching except at 23andMe and Ancestry, where we are both tested on the vendors’ platforms. Neither 23andMe nor Ancestry accepts uploads. I included the 23andMe and Ancestry comparisons as additional reference points.

The lack of a match at MyHeritage, another company that implements imputation, is quite interesting. Karen and I, even with a significantly sized segment are not shown as a match at MyHeritage.

If imputation actually breaks some matching segments apart, like the chromosome 8 segment at DNA.Land, it’s possible that the resulting smaller individual segments simply didn’t exceed the MyHeritage matching threshold. It would appear that the MyHeritage matching threshold is probably 9cM, given that my smallest segment match of all my matches at MyHeritage is 9cM. Therefore, a 31 or 32 cM segment would have to be broken into 4 roughly equally sized pieces (32/4=8) for the match to Karen not to be detected because all segment pieces are under 9cM. MyHeritage has experienced unreliable matching since their rollout in mid 2016, so their issue may or may not be imputation related.
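The piece-count arithmetic can be checked with a short sketch. It assumes the segment is broken into equal pieces, which is the most favorable case for staying under the threshold, and it takes 9cM as the threshold, which is only my estimate from above and not a published MyHeritage number. The function name is hypothetical.

```python
import math

def min_equal_pieces_below(segment_cm, threshold_cm):
    """Smallest number of equal pieces so that every piece is under the threshold."""
    return math.floor(segment_cm / threshold_cm) + 1

for length_cm in (31, 32):
    pieces = min_equal_pieces_below(length_cm, 9)
    print(length_cm, "cM needs", pieces, "pieces of", length_cm / pieces, "cM each")
```

Three equal pieces of a 32cM segment would still be more than 10cM each, so it really would take at least four breaks’ worth of pieces for the match to vanish this way.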

The Common Ancestor

At Family Tree DNA, Karen does not match my mother, so I can tell positively that she is related through my father’s line. She and I triangulate on our common segment with three other individuals who descend from Abraham Estes 1647-1720.

Utilizing the chromosome browser, we do indeed match on chromosome 8 on a long segment, which is also our only match over 5cM at Family Tree DNA.

Based on our trees as well as the trees of our three triangulated Estes matches, Karen and I are most probably either 8th cousins, or 8th cousins once removed, assuming that is our only common line. I am 8th cousins with the other three triangulated matches on chromosome 8. Karen’s line has yet to be proven.

Imputation Matching Summary

I like the way that DNA.Land presents some of their features, but as for matching accuracy, you can view the match quality in various ways:

  1. DNA.Land did find the large match on chromosome 8. Of course, in terms of matching, that’s pretty difficult to miss at roughly 30cM, although MyHeritage managed. Imputation did split the large match into two, somehow, even though Karen and I match on that same segment as one segment at other vendors comparing the same files.
  2. Of the 39 DNA.Land total matches, other than the chromosome 8 match, two other matches are partial matches, according to GedMatch. Both are under 7cM.
  3. Of DNA.Land’s total 39 matches, 35 are entirely wrong, in addition to the two that are split, including two inaccurate imputed matches at over 5cM.
  4. At DNA.Land, I’m not so concerned about discerning between “real” and “false” small segment matches, as compared to both FTDNA and GedMatch, as I am about incorrectly imputed segments and matches. Whether small matches in general are false positives or legitimate can be debated, each smaller segment match based on its own merits. Truthfully, with larger segments to deal with, I tend to ignore smaller segments anyway, at least initially. However, imputation adds another layer of uncertainty on top of actual matching, especially, it appears, with smaller matches. Imputing entire segments of incorrect DNA concerns me.
  5. Having said that, I find it very concerning that MyHeritage, which also utilizes imputation, missed a significant match of over 30cM. I don’t know of a match of this size that has ever been proven to be a false match (through parental phasing), and in this case, we know which ancestor this segment descends from through independent verification utilizing multiple other matches. MyHeritage should have found that match, regardless of imputation, because that match is from portions of the two files that were both tested, not imputed.

Summary

To date, I’m not impressed with imputation matching relative to genetic genealogy at either DNA.Land or MyHeritage.

In one case, that of DNA.Land, imputation shows matches for segments that are not shown as matches at either Family Tree DNA or GedMatch who are comparing the same two testers’ files, but without imputation. Since DNA.Land did find the larger segment, and many of their smaller segments are simply wrong, I would suggest that perhaps they should only show larger segments. Of course, anyone who finds DNA.Land is probably an experienced genetic genealogist and probably already has files at both GedMatch and Family Tree DNA, so hopefully savvy enough to realize there are issues with DNA.Land’s matching.

In the second imputation case, that of MyHeritage, the match with Karen is missed entirely, although that may not be a function of imputation. It’s hard to determine.  MyHeritage is also comparing the same two files that Karen and I uploaded to the other vendors that found that match, both vendors that do and vendors that don’t utilize imputation.

Regardless of imputing additional locations, MyHeritage should have found the matching segment on chromosome 8 because that region does NOT need to be imputed. Their failure to do so may be a function of their matching routine and not of imputation itself. At this point, it’s impossible to discern the cause. We only know, based on matching at other vendors, that the non-match at MyHeritage is inaccurate.

Here’s what DNA.Land has to say about the imputed VCF file, which holds all of your imputed values, when you download the file. They pull no punches about imputation.

“Noisey and probabilistic.” Yes, I’d say they are right, and problematic as well, at least for genetic genealogists.

Extrapolating this even further, I find it more than a little frightening that my imputed data at DNA.Land will be utilized for medical research.

Quoting now from Promethease, a medical reference site that allows the consumer to upload their raw data files, providing consumers with a list of SNPs having either positive or negative research in academic literature:

DNA.land will take a person’s data as produced by such companies and impute additional variants based on population frequency statistics. To put this in concrete terms, a person uploading a typical 23andMe file of ~700,000 variants to DNA.land will get back an (imputed) file of ~39 million variants, all predicted to be present in the person. Promethease reports from such imputed files typically contain about 50% more information (i.e. 50% more genotypes) than the corresponding reports from raw (non-imputed) data.

Translated, this means that your imputed data provides about one and a half times as much “genetic information” as your actual tested data. The question remains, of course, how much of this imputed data is accurate.
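The figures in the Promethease quote can be put side by side with a little arithmetic. This sketch uses only the approximate numbers quoted above (~700,000 tested variants, ~39 million after imputation, about 50% more genotypes in the report). The variable names are mine.

```python
tested_variants = 700_000       # typical raw file, per the quote
imputed_variants = 39_000_000   # imputed file, per the quote
report_increase = 0.50          # "about 50% more" genotypes in the report

file_growth = imputed_variants / tested_variants
report_growth = 1 + report_increase

print(round(file_growth, 1))  # how many times larger the imputed file is
print(report_growth)          # how many times larger the report is
```

The file grows more than fifty-fold while the report grows by about half, so most imputed locations never show up in the report at all.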

That will be the topic of the third imputation article. Stay tuned.

______________________________________________________________

Concepts – Why Genetic Genealogy and Triangulation?

One of the questions often asked is why triangulation in genetic genealogy is so important.

Before I answer that, let’s take a look at why genealogists use autosomal DNA for genetic genealogy in the first place.

Why Genetic Genealogy?

Aside from ethnicity testing, genetic genealogists utilize autosomal DNA testing to further their genealogical research or confirm the research they have already performed. Genetic genealogy cannot stand alone on DNA evidence, but must include traditional genealogical research. DNA is simply another tool in the genealogist’s tool box – albeit a critical one.

There are three established primary vendors in this field, Family Tree DNA, Ancestry and 23andMe, plus a few newcomers. All three vendors offer autosomal DNA tests utilized by genetic genealogists in various ways. If you want to learn more about the differences between these vendors’ offerings, please read the article, “Which DNA Test is Best?”

In order to achieve genealogical goals, there are four criteria that need to be met. All are required to achieve triangulation which is the only way to confirm a genealogical ancestral match to a specific ancestor.

  • DNA Matching – The tester’s DNA matches that of other testers at the company where they tested, or at GedMatch. All three vendors provide matching information, along with GedMatch, a third-party tool utilized by genetic genealogists.

Family Tree DNA assigns matches to either maternal, paternal or both sides of the tester’s tree based on connecting the DNA of relatives, up through third cousins, who have tested to their appropriate location in the tester’s tree.

In the example above, you can see the individuals linked to my tree include my mother with her Family Finder test, plus her two first cousins, Donald and Cheryl Ferverda who have also tested.

  • Ancestor Matching – The testers identify a common ancestor or ancestral line based on their previous work, aka, genealogy and family trees.  In the example above, the common ancestors are the parents of the brothers, John and Roscoe Ferverda.  Identifying a common ancestor is an easy task with known close relatives, but becomes more challenging the more distant the common ancestor.

Of the vendors, 23andMe does not offer a Gedcom upload or any way for testers to display trees, so the vendor has no trees to utilize for surname matching, although testers can link to external trees. Ancestry provides “tree matching,” shown above, and Ancestry and Family Tree DNA, shown below, both provide surname matching.

  • Segment Matching – Utilizing chromosome browsers or downloaded match lists including segment information to identify actual DNA segments that match other testers.

Family Tree DNA’s chromosome browser is shown above.

Each individual tester will have two groups of matches on the same segment, one group from their mother’s side of the tree and one from their father’s side of the tree. Each tester carries DNA inherited from both parents on two different “sides” of each chromosome. You can read more about that in the article, One Chromosome, Two Sides, No Zipper – ICW and the Matrix.

Of the three vendors, Ancestry does not provide segment matching, a chromosome browser, or any segment information, so testers cannot perform this step at Ancestry.

23andMe does provide this information, but each tester must individually “opt in” to data sharing, and many do not. If testers do not globally “opt in” they must authorize sharing individually for every match, so testers will not be able to see the chromosome segment information for many 23andMe matches. In my case, only about 60% are sharing.

Family Tree DNA provides a chromosome browser, the file download capability with segment information, and everyone authorizes sharing of information when they initially test – so there is no opt-in confusion.

Ancestry and 23andMe raw DNA data files can be transferred to both Family Tree DNA and GedMatch where chromosome browsers and other tools are available. For more information about transferring files, please read Autosomal DNA Transfers – Which Companies Accept Which Tests?

Triangulation – The process used to combine all three of the above steps in order to assign specific segments of the tester’s DNA to specific ancestors, by virtue of:

  • The tester’s DNA matching the DNA of other testers on a specific segment.
  • Identifying that the individuals who match the tester on that segment also match each other. This is part of the methodology employed to group the tester’s matches into two groups, the maternal and paternal groupings.
  • Identifying which ancestor contributed that segment to all of the people who match the tester and each other on that same segment.

In order for a group of matches to triangulate, they must match each other on the same segment of DNA and they must all share a common ancestor.

Triangulation is part DNA, meaning the inheritance, part technology, meaning the ability to show that all testers in a match group match each other on the same segment, and part genealogy, meaning the ability to identify the common ancestor of the group of individuals.

The following chart shows a portion of my match download file on chromosome 5 from Family Tree DNA.

As you can see, these matches all cover significant portions of the same segment on chromosome 5.

Without further investigation, we know that I match all of these people, but we don’t know what that information is telling us about my genealogy. We don’t know who matches each other, and we can’t tell which people are from my mother’s and father’s sides. We also don’t know who the common ancestor is or common ancestors are.

However, looking at the trees of the individuals involved, or contacting them for further information, and/or recognizing known cousins from a specific line all combine to contribute to the identification of our common ancestors.

Below is the same spreadsheet, now greatly enriched after my genealogy work is applied to the DNA matches in two additional columns.

I’ve colored my triangulated groups pink for my mother’s side and blue for my father’s side.

In this case, I also have access to my cousins’ DNA match results, so I can view their matches as well, looking for common matches on my match list.

One of the reasons genealogists always suggest testing older family members and as many cousins as possible is because triangulation becomes much easier with known cousins from particular lines to point the way to the common ancestor. In this case, one cousin, Joe, is from my mother’s side and one, Lou, is from my father’s side.

By looking at my matches’ genealogy, I’ve now been able to assign this particular segment on chromosome 5, on my mother’s side, to ancestors Johann Michael Miller and his wife Susanna Berchtol. The same segment, on my father’s side, is inherited from Charles Dodson and his wife, Ann, last name unknown.

In order to achieve triangulation, the common ancestor must be determined for the match group. Once triangulation is achieved, descent from the common ancestor is confirmed.

Unless you are dealing with very close known relatives, like the Ferverda first cousins, there is no other way to prove a genetic connection to a specific ancestor.

At Family Tree DNA, I can utilize the chromosome browser and the ICW and matrix tools to determine which of this group matches each other. At 23andMe, I can utilize their shared DNA matching tool. This information can then be recorded in my DNA spreadsheet, as illustrated above.

Triangulation cannot be achieved at Ancestry or utilizing their tools. Ancestry’s DNA Circles provide extended match groups, indicating who matches whom for a particular ancestor shown in a tester’s tree, but do not indicate that the matches are on the same segment. Circles do not guarantee that Circle members are matching on DNA from that ancestor, only that they do match and show a common ancestor in their tree.  The third triangulation step of segment matching is missing.  Ancestry does not provide segment information in any format, so Ancestry customers who want to triangulate can either retest elsewhere or download their data files to either Family Tree DNA or GedMatch for free.

Summary

Before the advent of genetic genealogy, genealogists had to take it on faith that the paper trail was accurate, and that there was no misattributed parentage – either through formal or informal adoption or hanky-panky.  That’s not the case anymore.

Today, DNA through triangulation can prove ancestry for groups of people to a common ancestor by identifying segments that have descended from that ancestor and are found in multiple descendants today.

Of course, the next step is to break down those remaining brick walls. For example, what is the birth name of Ann, wife of Charles Dodson, whose surname is unknown? Logically, the DNA descended from a couple, meaning Charles and Ann, contains DNA from both individuals. We don’t know if that segment on chromosome 5 is from Ann, Charles, or parts from both, BUT, if we begin to see a further breakdown to another, unknown family line among the Charles and Ann segments, that might be a clue.

One day, in the future, we’ll be able to identify our unknown family lines through DNA matches and other people’s triangulation. That, indeed, is the Holy Grail.

Additional Resources

If you’d like to read more specific information about autosomal DNA matching and triangulation, be sure to read the links in the article, above. The following articles may be of interest as well:

If you think you might come up short, because you have only one known cousin who has tested, well, think again.

Here’s wishing you lots of triangulated matches!!!


Autosomal DNA Transfers – Which Companies Accept Which Tests?

Somehow, I missed the announcement that Family Tree DNA now accepts uploads from MyHeritage.

Other people may have missed a few announcements too, or don’t understand the options, so I’ve created a quick and easy reference that shows which testing vendors’ files can be uploaded to which other vendors.

Why Transfer?

Just so that everyone is on the same page, if you test your autosomal DNA at one vendor, Vendor A, some other vendors allow you to download your raw data file from Vendor A and transfer your results to their company, Vendor B.  The transfer to Vendor B is either free or lower cost than testing from scratch.  One site, GedMatch, is not a testing vendor, but is a contribution/subscription comparison site.

Vendor B then processes your DNA file that you imported from Vendor A, and your results are then included in the database of Vendor B, which means that you can obtain your matches to other people in Vendor B’s data base who tested there originally and others who have also transferred.  You can also avail yourself of any other tools that Vendor B provides to their customers.  Tools vary widely between companies.  For example, Family Tree DNA, GedMatch and 23andMe provide chromosome browsers, while Ancestry does not.  All 3 major vendors (Family Tree DNA, Ancestry and 23andMe) have developed unique offerings (of varying quality) to help their customers understand the messages that their unique DNA carries.

Ok, Who Loves Whom?

The vendors in the left column are the vendors performing the autosomal DNA tests. The vendor row (plus GedMatch) across the top indicates who accepts upload transfers from whom, and which file versions. Please consider the notes below the chart.

(Chart updated September 28, 2017)

Please note that on August 9, 2017, 23andMe began processing on the Illumina GSA chip, which is not compatible with earlier versions.  As of late September 2017, only GedMatch accepts their upload and only in their Genesis sandbox area, not the normal production matching area.  This is due to the small overlap area with existing chips.  You can read more about the GSA chip and its ramifications here.

  • Family Tree DNA accepts uploads from both other major vendors (Ancestry and 23andMe) but the versions that are compatible with the chip used by FTDNA will have more matches at Family Tree DNA. 23andMe V3, Ancestry V1 and MyHeritage results utilize the same chip and format as FTDNA. 23andMe V4 and Ancestry V2 use different formats that include only about half of the common locations. Family Tree DNA still allows free transfers and comparisons with other testers, but since only about half of the DNA locations are in common with the FTDNA chip, matches will be fewer. Additional functions can be unlocked for a one-time $19 fee.
  • Ancestry, 23andMe and Genographic do not accept transfer data from any other vendors.
  • MyHeritage does accept transfers, although that option is not easy to find. I checked with a MyHeritage representative and they provided me with the following information:  “You can upload an autosomal DNA file from your profile page on MyHeritage. To access your profile page, login to your MyHeritage account, then click on your name which is displayed towards the top right corner of the screen. Click on “My profile”. On the profile page you’ll see a DNA tab, click on the tab and you’ll see a link to upload a file.”  MyHeritage has also indicated that they will be making ethnicity results available to individuals who transfer results into their system in May, 2017.
  • LivingDNA has just released an ethnicity product and does not have DNA matching capability to other testers.  Living DNA imputes DNA locations that they don’t test, but the initial download only includes the DNA locations actually tested.
  • WeGene’s website is in Chinese and they are not a significant player, but I did include them because GedMatch accepts their files. WeGene’s website indicates that they accept 23andMe uploads, but I am unable to determine which version or versions. Given that their terms and conditions and privacy and security information are not in English, I would be extremely hesitant before engaging in business. I would not be comfortable trusting an online translation for this type of document. SNPedia reports that WeGene has data quality issues.
  • GedMatch is not a testing vendor, so has no entry in the left column, but does provide tools and accepts all versions of files from each vendor that provides files, to date, with the exception of the Genographic Project.  GedMatch is free (contribution based) for many features, but does have more advanced functions available for a $10 monthly subscription. The GedMatch Genesis platform is a sandbox area for files from vendors that cannot be put into production today due to matching and compatibility issues.
  • The Genographic Project tested their participants at the Family Tree DNA lab until November 2016, when they moved to the Helix platform, which performs an exome test using a different chip.
  • The Ancestry V2 chip began processing in May 2016.
  • The 23andMe V3 chip began processing in December 2010. The 23andMe V4 chip began processing in November 2013. Their V5 chip began processing on August 9, 2017.

Incompatible Files

Please be aware that vendors that accept different versions of other vendors’ files can only work with the tested locations that are in the files generated by the testing vendors unless they use a technique called imputation.

For example, Family Tree DNA tests about 700,000 locations which are on the same chip as MyHeritage, 23andMe V3 and Ancestry V1. In the later 23andMe V4 test, the earlier 23andMe V2 and the Ancestry V2 tests, only a portion of the same locations are tested.  The 23andMe V4 and Ancestry V2 chips only test about half of the file locations of the vendors who utilize the Illumina OmniExpress chip, but not the same locations as each other since both the Ancestry V2 and 23andMe V4 chips are custom. 23andMe and Ancestry both changed their chips from the OmniExpress version and replaced genealogically relevant locations with medically relevant locations, creating a custom chip.

Update:  In August 2017, 23andMe introduced their V5 chip which has only about 20% overlap with previous chips.

I know this is confusing, so I’ve created the following chart for chip and test compatibility comparison.

(Chart updated Sept. 28, 2017)

You can easily see why the FTDNA, Ancestry V1, 23andMe V3 and MyHeritage tests are compatible with each other.  They all tested utilizing the same chip.  However, each vendor then applies their own unique matching and ethnicity algorithms to customer results, so your results will vary with each vendor, even when comparing ethnicity predictions or matching the same two individuals to each other.

Apples to Apples to Imputation

It’s difficult for vendors to compare apples to apples with non-compatible files.

I wrote about imputation in the article about MyHeritage, here and also more generally, here. In a nutshell, imputation is a technique used to infer the DNA for locations a vendor doesn’t test (or doesn’t receive in a transfer file from another vendor) based on the location’s neighboring DNA and DNA that is “normally” passed together as a packet.

However, the imputed regions of DNA are not your DNA, and therefore don’t carry your mutations, if any.

I created the following diagram when writing the MyHeritage article to explain the concept of imputation when comparing multiple vendors’ files showing locations tested, overlap and imputed regions. You can click to enlarge the graphic.

Family Tree DNA has chosen not to utilize imputation for transfer files and only compares the actual DNA locations tested and uploaded in vendor files, while MyHeritage has chosen to impute locations for incompatible files. Family Tree DNA produces fewer, but accurate matches for incompatible transfer files.  MyHeritage continues to have matching issues.

MyHeritage may be using imputation for all transfer files to equalize the files to a maximum location count for all vendor files. This is speculation on my part, but is speculation based on the differences in matches from known compatible file versions to known matches at the original vendor and then at MyHeritage.

I compared matches to the same person at MyHeritage, GedMatch, Ancestry and Family Tree DNA. It appears that imputed matches do not consistently compare reliably. I’m not convinced imputation can ever work reliably for genetic genealogy, because we need our own DNA and mutations. Regardless, imputation is in its infancy today and due to the Illumina GSA chip replacing the OmniExpress chip, imputation will be widely used within the industry shortly for backwards compatibility.

To date, two vendors are utilizing imputation. LivingDNA is using imputation with the GSA chip for ethnicity, and MyHeritage for DNA matching.

Summary

Your best results are going to be to test on the platform that the vendor offers, because the vendor’s match and ethnicity algorithms are optimized for their own file formats and DNA locations tested.

That means that if you are transferring an Ancestry V1 file, a 23andMe V3 file or a MyHeritage file, for example, to Family Tree DNA, your matches at Family Tree DNA will be the same as if you tested on the FTDNA platform.  You do not need to retest at Family Tree DNA.

However, if you are transferring an Ancestry V2 file or 23andMe V4 file, you will receive some matches, somewhere between one quarter and one half as many as a test run on the vendor’s own chip would produce. For people who can’t be tested again, that’s certainly better than nothing, and cross-chip matching generally picks up the strongest matches because they tend to match in multiple locations. For people who can retest, testing at Family Tree DNA would garner more matches and better ethnicity results for those with 23andMe V2 and V4 tests as well as Ancestry V2 tests.

For absolutely best results, swim in all of the major DNA testing pools, test as many relatives as possible, and test on the vendor’s native chip to obtain the most matches.  After all, without sharing and matching, there is no genetic genealogy!


2016 Genetic Genealogy Retrospective

In past years, I’ve written a “best of” article about genetic genealogy happenings throughout the year. For several years, the genetic genealogy industry was relatively new, and there were lots of new tools being announced by the testing vendors and others as well.

This year is a bit different. I’ve noticed a leveling off – there have been very few announcements of new tools by vendors, with only a few exceptions.  I think genetic genealogy is maturing and has perhaps begun a new chapter.  Let’s take a look.

Vendors

Family Tree DNA

Family Tree DNA leads the pack this year with their new Phased Family Matches which utilizes close relatives, up to third cousins, to assign your matches to either maternal or paternal buckets, or both if the individual is related on both sides of your tree.

They are the first and remain the only vendor to offer this kind of feature.

Phased Family Matching is extremely useful in terms of identifying which side of your family tree your matches are from. This tool, in addition to Family Tree DNA’s nine other autosomal tools helps identify common ancestors by showing you who is related to whom.

Family Tree DNA has also added other features such as a revamped tree with the ability to connect DNA results to family members.  DNA results connected to the tree is the foundation for the new Phased Family Matching.

The new Ancient Origins feature, released in November, was developed collaboratively with Dr. Michael Hammer at the University of Arizona Hammer Lab.

Ancient European Origins is based on the full genome sequencing work now being performed in the academic realm on ancient remains. These European results fall into three primary groups based on age and culture.  A customer’s DNA is compared to the ancient remains to determine how much of the customer’s European DNA came from which group.  This exciting new feature allows us to understand more about our ancestors, long before the advent of surnames and paper or parchment records. Ancient DNA is redefining what we know, or thought we knew, about population migration.

You can view Dr. Hammer’s presentation given at the Family Tree DNA Conference in conjunction with the announcement of the new Ancient Origins feature here.

Family Tree DNA maintains its leadership position among the three primary vendors relative to Y DNA testing, mtDNA testing and autosomal tools.

Ancestry

In May of 2016, Ancestry changed the chip utilized by their tests, removing about 300,000 of their previous 682,000 SNPs and replacing them with medically optimized SNPs. The rather immediate effect was that due to the chip incompatibility, Ancestry V2 test files created on the new chip cannot be uploaded to Family Tree DNA, but they can be uploaded to GedMatch.  Family Tree DNA is working on a resolution to this problem.

I tested on the new Ancestry V2 chip, and while there is a difference in how much matching DNA I share with my matches as compared to the V1 chip, it’s not as pronounced as I expected. There is no need for people who tested on the earlier chip to retest.

Unfortunately, Ancestry has remained steadfast in their refusal to implement a chromosome browser, instead focusing on sales by advertising the ethnicity “self-discovery” aspect of DNA testing.

Ancestry does have the largest autosomal data base but many people tested only for ethnicity, don’t have trees or have private trees.  In my case, about half of my matches fall into that category.

Ancestry maintains its leadership position relative to DNA tree matching, known as a Shared Ancestor Hint, identifying common ancestors in the trees of people whose DNA matches.

23andMe

23andMe struggled for most of the year to meet a November 2015 deadline, which is now more than a year past, to transition its customers to the 23andMe “New Experience” which includes a new customer interface. I was finally transitioned in September 2016, and the experience has been very frustrating and extremely disappointing, and that’s putting it mildly. Some customers, specifically international customers, are still not transitioned, nor is it clear if or when they will be.

I tested on the older 23andMe V3 chip as well as their newer V4 chip. After my transition to the New Experience, I compared the results of the two tests. The new security rules incorporated into the New Experience meant that I was only able to view about 25% of my matches (400 of my 1,651 V3 matches or 1,700 V4 matches). 23andMe has, in essence, relegated themselves to non-player status for genetic genealogy, except perhaps for adoptees who need to swim in every pool – but only then as a last place candidate. And those adoptees had better pray that if they have a close match, that match falls into the 25% of their matches that are useful.

In December, 23andMe began providing segment information for ethnicity segments, except the parental phasing portion does not function accurately, calling into question the overall accuracy of the 23andMe ethnicity information. Ironically, up until now, while 23andMe slipped in every other area, they had been viewed as the best, meaning most accurate, in terms of ethnicity estimates.

New Kids on the Block

MyHeritage

In May of 2016, MyHeritage began encouraging people who have tested at other vendors to upload their results. I was initially very hesitant, because aside from GedMatch that has a plethora of genetic genealogy tools, I have seen no benefit to the participant to upload their DNA anyplace, other than Family Tree DNA (available for V3 23andMe and V1 Ancestry only).

Any serious genealogist is going to test at both Family Tree DNA and Ancestry, at a minimum, and upload to GedMatch. MyHeritage was “just another upload site” with no tools, not even matching initially.

However, in September, MyHeritage implemented matching, although they have had a series of what I hope are “startup issues,” with numerous invalid matches, apparently resulting from their usage of imputation.

Imputation is when a vendor infers what they think your DNA will look like in regions where other vendors test, and your vendor doesn’t. The best example would be the 300,000 or so Ancestry locations that are unique to the Ancestry V2 chip. Imputation would result in a vendor “inferring” or imputing your results for these 300,000 locations based on…well, we don’t exactly know based on what. But we do know it cannot be accurate.  It’s not your DNA.

In the midst of this, in October, 23andMe announced on their forum that they had severed a previous business relationship with MyHeritage where 23andMe allowed customers to link to MyHeritage trees in lieu of having customer trees directly on the 23andMe site.  This approach had been problematic because customers are only allowed 250 individuals in their tree for free, and anything above that requires a MyHeritage subscription.  Currently 23andMe has no tree capability.

It appears that MyHeritage refined their DNA matching routines at least somewhat, because many of the bogus matches were gone in November when they announced that their beta was complete and that they were going to sell their own autosomal DNA tests. However, matching issues have not disappeared or been entirely resolved.

While Family Tree DNA’s lab will be processing the MyHeritage autosomal tests, the results will NOT be automatically placed in the Family Tree DNA data base.

MyHeritage will be doing their own matching within their own database. There are no comparison tools, tree matching or ethnicity estimates today, but MyHeritage says they will develop a chromosome browser and ethnicity estimates. However, it is NOT clear whether these will be available for free to individuals who have transferred their results into MyHeritage or if they will only be available to people who tested through MyHeritage.

For the record, I have 28 matches today at MyHeritage.

I found that my second closest match at MyHeritage is also at Ancestry.

At MyHeritage, they report that I match this individual on a total of 64.1 cM, across 7 segments, with the largest segment being 14.9 cM.

Ancestry reports this same match at 8.3 cM total across 1 segment, which of course means that the longest segment is also 8.3 cM.

Ancestry estimates the relationship as 5th to 8th cousin, and MyHeritage estimates it as 2nd to 4th.

While I think Ancestry’s Timber strips out too much DNA, there is clearly a HUGE difference in the reported results and the majority of this issue likely lies with the MyHeritage DNA imputation and matching routines.

I uploaded my Family Tree DNA autosomal file to MyHeritage, so MyHeritage is imputing at least 300,000 SNPs for me – almost half of the SNPs needed to match to Ancestry files.  They are probably imputing that many for my match’s file too, so that we have an equal number of SNPs for comparison.  Combined, this would mean that my match and I are comparing 382,000 actual SNPs that we both tested, and roughly 600,000 SNPs that we did not test and were imputed.  No wonder the MyHeritage numbers are so “off.”
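The arithmetic behind those numbers can be laid out in a few lines of Python. The SNP counts are the approximate figures quoted in this article. The assumptions that my Family Tree DNA file covers the same 682,000 locations as the original Ancestry chip, and that each file has about 300,000 locations imputed, are my speculation, not something MyHeritage has confirmed.

```python
# Approximate figures quoted in the article.
ANCESTRY_V1_SNPS = 682_000  # SNPs on the earlier Ancestry chip
REPLACED_SNPS = 300_000     # SNPs swapped out on the Ancestry V2 chip

# Locations that both my file and my match's file actually tested.
shared_tested = ANCESTRY_V1_SNPS - REPLACED_SNPS

# Assumption: about 300,000 locations are imputed for each of the two files.
imputed_total = 2 * REPLACED_SNPS

# Share of the compared locations that neither of us actually tested.
imputed_share = imputed_total / (shared_tested + imputed_total)

print(f"Tested by both:  {shared_tested:,}")
print(f"Imputed:         {imputed_total:,}")
print(f"Imputed share:   {imputed_share:.0%}")
```

Under those assumptions, roughly six out of every ten locations being compared would be inferred rather than measured.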

MyHeritage has a long way to go before they are a real player in this arena. However, MyHeritage has potential, as they have a large subscriber base in Europe, where we desperately need additional testers – so I’m hopeful that they can attract additional genealogists that are willing to test from areas that are under-represented to date.

MyHeritage got off to a bit of a rocky start by requiring users to relinquish the rights to their DNA, but then changed their terms in May, according to Judy Russell’s blog.

All vendors can change their terms at any time, in a positive or negative direction, so I would strongly encourage all individuals considering utilizing any testing company or upload service to closely read all the legal language, including Terms and Conditions and any links found in the Terms and Conditions.

Please note that MyHeritage is a subscription genealogy site, similar to Ancestry.  MyHeritage also owns Geni.com.  One site, MyHeritage, allows individual trees and the other, Geni, embraces the “one world tree” model.  For a comparison of the two, check out Judy Russell’s articles, here and here.  Geni has also embraced DNA by allowing uploads from Family Tree DNA of Y, mitochondrial and autosomal, but the benefits and possible benefits are much less clear.

If the MyHeritage story sounds like a confusing soap opera, it is.  Let’s hope that 2017 brings both clarity and improvements.

Living DNA

Living DNA is a company out of the British Isles with a new test that purports to provide you with a breakdown of your ethnicity and the locations of your ancestral lines within 21 regions in the British Isles.  Truthfully, I’m very skeptical, but open minded.

They have had my kit for several weeks now, and testing has yet to begin.  I’ll write about the results when I receive them.  So far, I don’t know of anyone who has received results.

Genos

I debated whether or not I should include Genos, because they are not a test for genealogy and are medically focused. However, I am including them because they have launched a new model for genetic testing wherein your full exome is tested, you receive the results along with information on the SNPs where mutations are found. You can then choose to be involved with research programs in the future, if you wish, or not.

That’s a vastly different model than the current approach taken by 23andMe and Ancestry where you relinquish your rights to the sale of your DNA when you sign up to test.  I like this new approach with complete transparency, allowing the customer to decide the fate of their DNA. I wrote about the Genos test and the results, here.

Third Parties

Individuals sometimes create and introduce new tools to assist genealogists with genetic genealogy and analysis.

I have covered these extensively over the years.

GedMatch, WikiTree, DNAGedcom.com and Kitty Cooper’s tools remain my favorites.

I love Kitty’s Ancestor Chromosome Mapper which maps the segments identified with your ancestors on your chromosomes. I just love seeing which ancestors’ DNA I carry on which chromosomes.  Somehow, this makes me feel closer to them.  They’re not really gone, because they still exist in me and other descendants as well.

In order to use Kitty’s tool, you’ll have to have mapped at least some of your autosomal DNA to ancestors.

The Autosomal DNA Segment Analyzer written by Don Worth and available at DNAGedcom is still one of my favorite tools for quick, visual and easy to understand segment matching results.

GedMatch has offered a triangulation tool for some time now, but recently introduced a new Triangulation Groups tool.

I have not utilized this tool extensively but it looks very interesting. Unfortunately, there is no explanation or help function available for what this tool is displaying or how to understand and interpret the results. Hopefully, that will be added soon, as I think it would be possible to misinterpret the output without educational material.

GedMatch also introduced their “Evil Twin” tool, which made me laugh when I saw the name.  Using parental phasing, you can phase your DNA to your parent or parents at GedMatch, creating kits that only have your mother’s half of your DNA, or your father’s half.  These phased kits allow you to see your matches that come from that parent, only.  However, the “Evil Twin” feature creates a kit made up of the DNA that you DIDN’T receive from that parent – so in essence it’s your other half, your evil twin – you know, that person who got blamed for everything you “didn’t do.”  In any case, this allows you to see the matches to the other half of your parent’s DNA that do not show up as your matches.

Truthfully, the Evil Twin tool is interesting, but since you have to have that parent’s DNA to phase against in the first place, it’s just as easy to look at your parent’s matches – at least for me.
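For the curious, here is a rough Python sketch of the idea behind phasing against one parent and building an “evil twin” kit. This is my own simplified illustration, not GedMatch’s actual algorithm. The genotypes are invented, and locations that can’t be resolved from one parent alone are simply marked with a question mark.

```python
def phase_against_parent(child, parent):
    """Return (allele received from this parent, allele NOT received),
    or None when one parent alone cannot settle the question."""
    c1, c2 = child
    p1, p2 = parent
    if p1 == p2:
        # Parent carries two identical copies, so the evil twin gets the same allele.
        return (p1, p2) if p1 in child else None
    if c1 == c2:
        # Child carries two identical copies, so one of them came from this parent.
        if c1 == p1:
            return (p1, p2)
        if c1 == p2:
            return (p2, p1)
        return None  # genotypes conflict (testing error or not the parent)
    shared = {c1, c2} & {p1, p2}
    if len(shared) == 1:
        received = shared.pop()
        return (received, p2 if received == p1 else p1)
    return None  # identical heterozygous genotypes (ambiguous) or conflicting data


def build_kits(child_genotypes, parent_genotypes):
    """Build the phased kit and the evil twin kit, location by location."""
    phased, evil_twin = [], []
    for child, parent in zip(child_genotypes, parent_genotypes):
        received, not_received = phase_against_parent(child, parent) or ("?", "?")
        phased.append(received)
        evil_twin.append(not_received)
    return "".join(phased), "".join(evil_twin)


# Hypothetical genotypes at four locations.
child_genotypes = ["AA", "AG", "CT", "GG"]
parent_genotypes = ["AG", "AG", "CC", "GG"]
phased_kit, evil_twin_kit = build_kits(child_genotypes, parent_genotypes)
print(phased_kit, evil_twin_kit)
```

At the second location, both the child and the parent are AG, so there is no way to tell from this parent alone which allele was passed down. Testing the other parent as well is what resolves locations like that one.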

Others offer unique tools that are a bit different.

DNAadoption.com offers tools, search and research techniques, especially for adoptees and those looking to identify a parent or grandparents, but perhaps even more important, they offer genetic genealogy classes including basic and introductory.

I send all adoptees in their direction, but I encourage everyone to utilize their classes.

WikiTree has continued to develop and enhance their DNA offerings.  While WikiTree is not a testing service nor do they offer autosomal data tools like Family Tree DNA and GedMatch, they do allow individuals to discover whether anyone in their ancestral line has tested their Y, mitochondrial or autosomal DNA.

Specifically, you can identify the haplogroup of any male or female ancestor if another individual from that direct lineage has tested and provided that information for that ancestor on WikiTree.  While I am generally not a fan of the “one world tree” types of implementations, I am a fan of WikiTree because of their far-sighted DNA comparisons, the fact that they actively engage their customers, they listen and they expend a significant amount of effort making sure they “get it right,” relative to DNA. Check out WikiTree’s article,  Putting DNA Results Into Action, for how to utilize their DNA Features.

Thanks particularly to Chris Whitten at WikiTree and Peter Roberts for their tireless efforts.  WikiTree is the only vendor to offer the ability to discover the Y and mtDNA haplogroups of ancestors by searching trees.

All of the people creating the tools mentioned above, to the best of my knowledge, are primarily volunteers, although GedMatch does charge a small subscription fee for their high end tools, including the triangulation and evil twin tools. DNAGedcom does as well. WikiTree generates some revenue for the site through ads on pages of non-members. DNAAdoption charges nominally for classes but they do have need-based scholarships. Kitty has a donation link on her website and all of these folks would gladly accept donations, I’m sure. Websites and everything that goes along with them aren’t free. Donations are a nice way to say thank you.

What Defined 2016

I have noticed two trends in the genetic genealogy industry in 2016, and they are intertwined – ethnicity and education.

First, there is an avalanche of new testers, many of whom are not genetic genealogists.

Why would one test if they weren’t a genetic genealogist?

The answer is simple…

Ethnicity.

Or more specifically, the targeted marketing of ethnicity.  Ethnicity testing looks like an easy, quick answer to a basic human question, and it sells kits.

Ethnicity

“Kim just wanted to know who she was.”

I have to tell you, these commercials absolutely make me CRINGE.

Yes, they do bring additional testers into the community, BUT those testers arrive with significantly mis-set expectations. If you’re wondering WHY I would suggest that ethnicity results really cannot tell you “who you are,” check out this article about ethnicity estimates.

And yes, that’s what they are, estimates – very interesting estimates, but estimates just the same.  Estimates that provide important and valid hints and clues, but not definitive answers.

ESTIMATES.

Nothing more.

Estimates based on proprietary vendor algorithms that tend to be fairly accurate at the continental level, and not so much within continents – in particular, not terribly accurate within Europe. Not all of this can be laid at the vendors’ feet. For example, DNA testing is illegal in France. Not to mention, genetic genealogy and population genetics are still new and emerging fields. We’re on the frontier, folks.

The ethnicity results one receives from the 3 major vendors (Ancestry, Family Tree DNA and 23andMe) and the various tools at GedMatch don’t and won’t agree – because they use different reference populations, different matching routines, etc.  Not to mention people and populations move around and have moved around.

The next thing that happens, after these people receive their results, is that we find them on the Facebook groups asking questions like, “Why doesn’t my full blooded Native American grandmother show up?” and “I just got my Ancestry results back. What do I do?”  They mean that question quite literally.

I’m not making fun of these people, or light of the situation. Their level of frustration and confusion is evident. I feel sorry for them…but the genetic genealogy community and the rest of us are left with applying ointment and Band-Aids.  Truthfully, we’re out-numbered.

Because of the expectations, people who test today don’t realize that genetic testing is a TOOL, it’s not an ANSWER. It’s only part of the story. Oh, and did I mention, ethnicity is only an ESTIMATE!!!

But an estimate isn’t what these folks are expecting. They are expecting “the answer,” their own personal answer, which is very, very unfortunate, because eventually they are either unhappy or blissfully unaware.

Many become unhappy because they perceive the results to be in error without understanding anything about the technology or what information can reasonably be delivered, or they swallow “the answer” lock, stock and barrel, again, without understanding anything about the technology.

Ethnicity is fun and it isn’t “bad,” but the results need to be evaluated in context with other information, such as Y and mitochondrial haplogroups, genealogical records and ethnicity results from the other major testing companies.

Fortunately, we can recruit some of the ethnicity testers to become genealogists, but that requires education and encouragement. Let’s hope that those DNA ethnicity results light the fires of curiosity and that we can fan those flames!

Education

The genetic genealogy community desperately needs educational resources, in part as a result of the avalanche of new testers – approximately 1 million a year, and that estimate may be low. Thankfully, we do have several education options – but we can always use more.  Unfortunately, the learning curve is rather steep.

My blog offers just shy of 800 articles, all keyword searchable, but one has to first find the blog and want to search and learn, as opposed to being handed “the answer.”

Of course, the “Help” link is always a good place to start as are these articles, DNA Testing for Genealogy 101 and Autosomal DNA Testing 101.  These two articles should be “must reads” for everyone who has DNA tested, or wants to, for that matter.  Tips and Tricks for Contact Success is another article that is immensely helpful to people just beginning to reach out.

In order to address the need for basic understanding of autosomal DNA principles, tools and how to utilize them, I began the “Concepts” series in February 2016. To date, the series includes 15 articles about genetic genealogy concepts. To be clear, DNA testing is only the genetic part of genetic genealogy, the genealogical research part being the second half of the equation.

My blog isn’t the only resource of course.

Kelly Wheaton provides 19 free lessons in her Beginners Guide to Genetic Genealogy.

Other blogs I highly recommend include:

Excellent books in print that should be in every genetic genealogist’s library:

And of course, the ISOGG Wiki.

Online Conference Resources

The good news and bad news is that I’m constantly seeing a genetic genealogy seminar, webinar or symposium hosted by a group someplace that is online, and often free. When I see names I recognize as being reputable, I am delighted that there is so much available to people who want to learn.

And for the record, I think that includes everyone. Even professional genetic genealogists watch these sessions, because you just never know what wonderful tidbit you’re going to pick up.  Learning, in this fast moving field, is an everyday event.

The bad news is that I can’t keep track of everything available, so I don’t mean to slight any resource.  Please feel free to post additional resources in the comments.

You would be hard pressed to find any genealogy conference, anyplace, today that didn’t include at least a few sessions about genetic genealogy. However, genetic genealogy has come of age and has its own dedicated conferences.

Dr. Maurice Gleeson, the gentleman who coordinates Genetic Genealogy Ireland, films the sessions at the conference and then makes them available, for free, on YouTube. This link provides a list of the various sessions from 2016 and past years as well. Well worth your time! A big thank you to Maurice!!!

The 19-video series from the I4GG Conference this fall is now available for $99. This series is an excellent opportunity for genetic genealogy education.

As always, I encourage project administrators to attend the Family Tree DNA International Conference on Genetic Genealogy. The sessions are not filmed, but the slides are made available after the conference, courtesy of the presenters and Family Tree DNA. You can view the presentations from 2015 and 2016 at this link.

Jennifer Zinck attended the conference and published her excellent notes here and here, if you want to read what she had to say about the sessions she attended. Thankfully, she can type much faster and more accurately than I can! Thank you so much Jennifer.

If you’d like to read about the unique lifetime achievement awards presented at the conference this year to Bennett Greenspan and Max Blankfeld, the founders of Family Tree DNA, click here. They were quite surprised!  This article also documents the history of genetic genealogy from the beginning – a walk down memory lane.

The 13th annual Family Tree DNA conference will be held November 10-12, 2017, at the Hyatt Regency North Houston. Registration is always limited due to facility size, so mark your calendars now, watch for the announcement and be sure to register in time.

Summary

2016 has been an extremely busy year. I think my blog has had more views, more comments and by far, more questions, than ever before.

I’ve noticed that the membership in the ISOGG Facebook group, dedicated to genetic genealogy, has increased by about 50% in the past year, from roughly 8,000 members to just under 12,000. Other social media groups have been formed as well, some focused on specific aspects of genetic genealogy, such as specific surnames, adoption search, Native American or African American heritage and research.

The genetic aspect of genealogy has become “normal” today, with most genealogists not only accepting DNA testing, but embracing the various tools and what they can do for us in terms of understanding our ancestors, tracking them, and verifying that they are indeed who we think they are.

I may have to explain the three basic kinds of DNA testing and how they are used today, but no longer do I have to explain THAT DNA testing for genealogy exists and that it’s legitimate.

I hope that each of us can become an ambassador for genetic genealogy, encouraging others to test, with appropriate expectations, and helping to educate, enlighten and encourage. After all, the more people who test and are excited about the results, the better for everyone else.

Genetic genealogy is and can only be a collaborative team sport.

Here’s wishing you many new cousins and discoveries in 2017.

Happy New Year!!!
